TISSUE SPECIFIC EXPRESSION OF RETINOBLASTOMA PROTEIN

BACKGROUND OF THE INVENTION 

Both the retinoblastoma gene (RB) and transcription
factor E2F play a critical role in cell growth control (for a
review, see Adams, P. & Kaelin, W. Seminars in Cancer Biology
6:99-108 (1995)). The RB locus is frequently inactivated in a
variety of human tumor cells. Reintroduction of a wild-type
RB gene (e.g., Bookstein et al. Science 247:712-715 (1990)) or
RB protein (pRB) (e.g., Antelman et al. Oncogene 10:697-704
(1995)) into RBneg/RBmut cells can suppress growth in
culture and tumorigenicity in vivo.

While E2F serves to activate transcription of S-phase
genes, its activity is kept in check by RB. RB arrests
cells by blocking exit from G1 into S-phase (for example, Dowdy
et al. Cell 73:499-511 (1993)), but the precise pathway of the
arrest remains unclear.

Although E2F forms complexes with RB, complex
formation is more efficient if an E2F-related protein, DP-1,
is present. E2F-1 and DP-1 form stable heterodimers which
bind to DNA (for example, Qin et al. Genes and Dev. 6:953-964
(1992)). DP-1-E2F complexes serve to cooperatively activate
transcription of E2F-dependent genes. Such transcription can
be repressed by pRB in the same manner as E2F-1 or DP-1
activated transcription.

Transcriptional repression of genes by RB in some
instances can be achieved by tethering pRB to a promoter. For
example, GAL4-pRB fusions bind to GAL4 DNA binding domains and
repress transcription from p53, Sp-1 or AP-1 elements (Adnane,
et al. J. Biol. Chem. 270:8837-8843 (1995); Weintraub, et al.
Nature 358:259-261 (1995)). Sellers, et al. (Proc. Natl.
Acad. Sci. 92:11544-11548 (1995)) disclosed fusions of amino
acid residues 1-368 of E2F with amino acids 379-792 or 379-928
of RB.

Chang, et al. (Science 267:518-521 (1995)) disclosed
the use of a replication-defective adenovirus-RB construct in
the reduction of neointima formation in two animal models of
restenosis, a hyperproliferative disorder.

SUMMARY OF THE INVENTION 

The instant invention provides the surprising result
that a fusion of an E2F polypeptide with an RB polypeptide is
more efficient in repressing transcription of the E2F promoter
than RB alone, and that such fusions can cause cell cycle
arrest in a variety of cell types. Such fusions can thus
address the urgent need for therapy of hyperproliferative
disorders, including cancer.

One aspect of the invention is a polypeptide
comprising a fusion of a transcription factor, the
transcription factor comprising a DNA binding domain, and a
retinoblastoma (RB) polypeptide, the RB polypeptide comprising
a growth suppression domain. Another aspect of the invention
is DNA encoding such a fusion polypeptide. The DNA can be
inserted in an adenovirus vector.

In some embodiments of the invention, the
transcription factor is E2F. The cyclin A binding domain of
the E2F can be deleted or nonfunctional. The E2F can comprise
amino acid residues about 95 to about 194 or about 95 to about
286 in some embodiments.

The retinoblastoma polypeptide can be wild-type RB,
RB56, or a variant or fragment thereof. In some embodiments,
the retinoblastoma polypeptide comprises amino acid residues
of about 379 to about 928. Preferred amino acid substitutions
of the RB polypeptide include residues 2, 608, 788, 807, and
811.

Another aspect of the invention is an expression
vector comprising DNA encoding a polypeptide, the polypeptide
comprising a fusion of a transcription factor, the
transcription factor comprising a DNA binding domain, and a
retinoblastoma (RB) polypeptide, the RB polypeptide comprising
a growth suppression domain. In some embodiments a
tissue-specific promoter is operatively linked to DNA encoding the
fusion polypeptide. The tissue-specific promoter can be a
smooth muscle alpha actin promoter.

Another aspect of the invention is a method for
treatment of hyperproliferative disorders comprising
administering to a patient a therapeutically effective dose of
an E2F-RB fusion polypeptide. The hyperproliferative disorder
can be cancer. In some embodiments the hyperproliferative
disorder is restenosis. The fusion polypeptide and nucleic
acid encoding the fusion polypeptide can be used to coat
devices used for angioplasty.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1A depicts the predicted amino acid sequence
of E2F.

Figure 1B depicts the nucleotide sequence of
transcription factor E2F.

Figure 2A depicts the nucleotide sequence of pRB as
disclosed by Lee, et al. (Nature 329:642-645 (1987)).

Figure 2B depicts the predicted amino acid sequence
of pRB.

Figure 3 is a diagrammatic representation of pCTM.

Figure 4 depicts the nucleotide sequence of plasmid
pCTM.

Figure 5 is a diagrammatic representation of pCTMI.

Figure 6 depicts the nucleotide sequence of pCTMI.

Figure 7 is a diagrammatic representation of plasmid
pCTMIE.

Figure 8 depicts the nucleotide sequence of pCTMIE.

Figure 9 is a diagram depicting E2F-RB fusion
constructs used in the examples. All E2F constructs commenced
at amino acid 95 and lacked part of the cyclin A binding
domain. E2F-437 contained the DNA binding domain (black),
heterodimerization domain (white), and the transactivation
domain (stippled). E2F-194 contained solely the DNA binding
domain. E2F-286 contained the DNA binding domain and the DP-1
heterodimerization domain. To generate E2F-194-RB56-5s and
E2F-286-RB56-5s, the E2F constructs were fused in-frame to
codon 379 of RB. C706F is an inactivating point mutation.

Figure 10 is a diagram depicting transcriptional
repression by E2F-RB fusion constructs.

Figure 11 (A-D) depicts expression of E2F-RB fusion
proteins in mammalian cell lines. Extracts were prepared from
cells used in E2-CAT reporter assays or in FACS assays and
analyzed with an anti-RB monoclonal antibody. In panel A,
results are shown for C33A cells transfected with (3)
RB56-H209, (4) RB56 wild-type, (5) RB56-5S, (6) E2F286-5S, (7)
E2F194-5S, (8) E2F194, (9) E2F286, (10) E2F437. Lane (1) is
an RB56 protein standard. Lane (2) is a mock transfection.
In panel B, results are shown for transfection of Saos-2 cells
with (1) RB56, (2,3) E2F194-5s, and (4,5) E2F286-5s. In panel
C, results are shown for transfection of 5637 cells with (2,3)
RB56 wild-type, (4,5) RB56-5S; (6,7) E2F194-5S; (7,8)
E2F286-5S. Lane (1) is an RB56 protein standard. In panel D,
results are shown for NIH-3T3 cells transfected with (3) RB56,
(4) E2F286-5S, (5) E2F194-5s. Lane (1) is an RB56 standard;
lane (2) is an RB110 standard.

Figure 12 depicts histogram analyses of flow 
cytometry of RB-expressing NIH-3T3 cells. 

Figure 13, panel A, depicts a comparison of the
effects of a CMV-driven recombinant adenovirus (ACN56) with
two isolates of a human smooth muscle alpha actin-driven
E2F-p56 fusion construct consisting of amino acids 95 through 286
of E2F linked directly and in-frame to p56 (amino acids
379-928 of RB cDNA), vs. a control virus (ACN) in a 3H-thymidine
uptake assay in the rat smooth muscle cell line A7R5. Panel
B depicts the effects of the same constructs in the rat
smooth muscle cell line A10.

Figure 14 depicts a comparison of the effects of the
viruses described in Fig. 13 in non-muscle cells. Panel (A)
depicts results in the breast carcinoma cell line MDA MB468.
Panel (B) depicts results in the non-small cell lung
carcinoma cell line H358.

Figure 15, top panel, depicts the relative
infectivity by adenovirus of different cell lines as judged by
the level of β-galactosidase (β-gal) staining following
infection with equal amounts of a recombinant adenovirus
expressing β-gal driven by a CMV promoter. H358 is a non-small
cell lung carcinoma cell line; MB468 is a breast carcinoma
cell line; A7R5 and A10 are smooth muscle cell lines. The
lower portion of the figure depicts the relative levels of p56
protein expressed in the same cells when infected with the
recombinant adenovirus ACN56, in which the p56 cDNA is driven
by the non-tissue specific CMV promoter.

Figure 16 depicts relative protein levels in cells
infected with the smooth muscle alpha actin promoter-driven
E2F-p56 fusion construct (ASN286-56). UN denotes uninfected;
50, 100, 250, and 500 refer to multiplicities of infection
(MOI).

Figure 17 is a bar graph depicting the ratio of
intima to media area (as a measurement of the inhibition of
neointima formation) from cross-sections (n=9) of rat carotid
arteries which were injured and treated with recombinant
adenoviruses expressing either β-gal, RB (ACNRB) or p56
(ACN56), all under the control of the CMV promoter.

Figure 18 is a series of three photographs depicting
restenosis in a rat angioplasty model. The panel on the left
depicts data from a normal animal; the central panel depicts
data from an animal injured and then treated with a β-gal
expressing recombinant virus; the panel on the right depicts
data from an animal injured and then treated with a
recombinant adenovirus expressing p56 (ACN56).

Figure 19 depicts tissue-specificity of the smooth
muscle alpha actin promoter, as demonstrated by its selective
ability to express the β-gal transgene in muscle cells but not
non-muscle cells. The panels on the left compare β-gal
expression in the breast carcinoma cell line MB468 infected
at either an MOI=1 with a CMV-driven β-gal construct (ACNBGAL)
or an MOI=100 with the smooth muscle promoter construct
(ASNBGAL). The panels on the right show β-gal expression in
the rat smooth muscle cell line A7R5 infected at either an
MOI=1 with ACNBGAL or an MOI=50 with ASNBGAL. Expression from
ASNBGAL is seen in the muscle cell line, but is absent in the
non-muscle cell line, despite the higher degree of infectivity
of the cells.

Figure 20 depicts the ability of recombinant
adenovirus expressing RB to transduce rat carotid arteries.
Recombinant adenovirus-treated arteries (1 x 10^9 pfu) were
harvested two days following balloon injury and infection.
Cross sections were fixed and an RB specific antibody was used
to detect the presence of RB protein in the tissue. The
control virus used was ACN. RB protein staining was evident
in the ACNRB treated sample, especially at higher
magnifications.

Figure 21 depicts a comparison of the effects of a
CMV-driven p56 recombinant adenovirus (ACN56E4) vs a human
smooth muscle alpha-actin promoter-driven E2F-p56 fusion
construct (ASN286-56) vs control adenoviral constructs
containing either the CMV or smooth muscle alpha-actin
promoters without a downstream transgene (ACNE3 or ASBE3-2
isolates shown, respectively). Assays were 3H-thymidine
uptake either in a smooth muscle cell line (A7R5) or a
non-muscle cell line (MDA-MB468, breast carcinoma). Results
demonstrated muscle tissue specificity using the smooth muscle
alpha-actin promoter and specific inhibition by both the p56
and E2F-p56 transgenes relative to their respective controls.

DESCRIPTION OF THE PREFERRED EMBODIMENT

The instant invention provides RB fusion constructs
including fusion polypeptides and vectors encoding them, and
methods for the use of such constructs in the treatment of
hyperproliferative diseases. In some preferred embodiments of
the invention, an RB polypeptide is fused to an E2F
polypeptide. Any E2F species can be used, typically E2F-1,
-2, -3, -4, or -5 (see, e.g., Wu et al. Mol. Cell. Biol.
15:2536-2546 (1995); Ivey-Hoyle et al. Mol. Cell. Biol.
13:7802 (1993); Vairo et al. Genes and Dev. 9:869 (1995);
Beijersbergen et al. Genes and Dev. 8:2680 (1994); Ginsberg
et al. Genes and Dev. 8:2665 (1994); Buck et al. Oncogene
11:31 (1995)), more typically E2F-1. Typically, the E2F
polypeptide comprises at least the DNA binding domain of E2F,
and may optionally include the cyclin A binding domain, the
heterodimerization domain, and/or the transactivation domain.
Preferably, the cyclin A binding domain is not functional.
The nucleotide and amino acid sequences of E2F referred to
herein are those of Genbank HUME2F, shown in Figures 1A and 1B.
Nucleic acid, preferably DNA, encoding such an E2F polypeptide
is fused in reading frame to nucleic acid encoding an RB
polypeptide. The RB polypeptide can be any RB polypeptide,
including conservative amino acid variants, allelic variants,
amino acid substitution, deletion, or insertion mutants, or
fragments thereof. Preferably, the growth suppression domain,
i.e., amino acid residues 379-928, of the RB polypeptide is
functional (Hiebert, et al. MCB 13:3384-3391 (1993); Qin, et
al. Genes and Dev. 6:953-964 (1992)). In some embodiments,
wild-type pRB110 is used. More preferably, a truncated
version of RB, RB56, is used. RB56 comprises amino acid
residues 379-928 of pRB110 (Hiebert, et al. MCB 13:3384-3391
(1993); Qin, et al. Genes and Dev. 6:953-964 (1992)). In some
embodiments, amino acid variants of RB at positions 2, 608,
612, 788, 807, or 811 are used singly or in combination. The
variant RB56-5S comprises wild-type RB56 having alanine
substitutions at 608, 612, 788, 807, and 811. Numbering of RB
amino acids and nucleotides is according to the RB sequence
disclosed by Lee, et al. (Nature 329:642-645 (1987)), hereby
incorporated by reference in its entirety for all purposes
(see Figure 2).
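
The residue numbering used above can be illustrated with a short sketch. It is offered only as an illustration under stated assumptions: the sequences are made-up placeholders of the same lengths as E2F-437 and full-length RB (not the sequences of Figures 1 and 2), and the helper names `residues`, `fuse`, and `substitute` are hypothetical. It shows how a fragment such as E2F 95-286 might be joined directly to RB 379-928, and how alanine substitutions at the RB56-5S positions map onto the truncated RB56 fragment.

```python
# Illustrative sketch only. The sequences are placeholders, not the
# E2F or RB sequences of Figures 1 and 2; all helper names are hypothetical.

def residues(seq: str, start: int, end: int) -> str:
    """Return residues start..end using 1-based, inclusive numbering."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("residue range lies outside the sequence")
    return seq[start - 1:end]


def fuse(e2f: str, e2f_range: tuple, rb: str, rb_range: tuple) -> str:
    """Join an E2F fragment directly to an RB fragment (no linker)."""
    return residues(e2f, *e2f_range) + residues(rb, *rb_range)


def substitute(fragment: str, first_residue: int, positions, new: str = "A") -> str:
    """Substitute residues given in full-length numbering.

    first_residue is the full-length number of the fragment's first residue,
    so full-length position p sits at index p - first_residue.
    """
    out = list(fragment)
    for pos in positions:
        idx = pos - first_residue
        if not (0 <= idx < len(out)):
            raise ValueError(f"position {pos} is not within the fragment")
        out[idx] = new
    return "".join(out)


# Placeholder full-length sequences: 437 residues for E2F, 928 for RB.
E2F_PLACEHOLDER = "G" * 437
RB_PLACEHOLDER = "S" * 928

# E2F residues 95-286 fused in frame to RB residues 379-928.
fusion = fuse(E2F_PLACEHOLDER, (95, 286), RB_PLACEHOLDER, (379, 928))

# RB56 fragment with alanine at 608, 612, 788, 807, and 811.
rb56 = residues(RB_PLACEHOLDER, 379, 928)
rb56_5s = substitute(rb56, 379, [608, 612, 788, 807, 811])
```

With these placeholders the E2F portion contributes 192 residues and the RB56 portion 550, so the fusion is 742 residues long; position 608 of full-length RB falls at index 229 of the RB56 fragment.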

Nucleic acids encoding the polypeptides of the
invention can be DNA or RNA. The phrase "nucleic acid
sequence encoding" refers to a nucleic acid which directs the
expression of a specific protein or peptide. The nucleic acid
sequences include both the DNA strand sequence that is
transcribed into RNA and the RNA sequence that is translated
into protein. The nucleic acid sequences include both the
full length nucleic acid sequences as well as non-full length
sequences derived from the full length protein. It is further
understood that the sequence includes the degenerate codons of
the native sequence or sequences which may be introduced to
provide codon preference in a specific host cell.
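
As a minimal sketch of what codon degeneracy and host codon preference mean in practice, the following recodes a short coding sequence without changing the encoded peptide. The codon assignments are a four-amino-acid subset of the standard genetic code; the `PREFERRED` table and the example sequence are hypothetical and do not describe any particular host cell or any sequence of the invention.

```python
# Illustrative sketch only. SYNONYMOUS is a subset of the standard genetic
# code; PREFERRED is a hypothetical host preference table chosen for the example.

SYNONYMOUS = {
    "M": ["ATG"],
    "A": ["GCT", "GCC", "GCA", "GCG"],
    "K": ["AAA", "AAG"],
    "L": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
}
PREFERRED = {"M": "ATG", "A": "GCC", "K": "AAG", "L": "CTG"}

# Reverse lookup: codon -> amino acid.
CODON_TO_AA = {codon: aa for aa, codons in SYNONYMOUS.items() for codon in codons}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon (no stop codon handling)."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str) -> str:
    """Replace each codon with the host-preferred synonymous codon."""
    return "".join(PREFERRED[aa] for aa in translate(dna))


native = "ATGGCTAAATTA"     # encodes M-A-K-L
optimized = recode(native)  # different codons, same peptide
```

Here `optimized` differs from `native` at the nucleotide level, yet both translate to the same four-residue peptide, which is the sense in which degenerate sequences encode the same polypeptide.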

The term "vector" as used herein refers to viral
expression systems, autonomous self-replicating circular DNA
(plasmids), and includes both expression and nonexpression
plasmids. Where a recombinant microorganism or cell culture
is described as hosting an "expression vector," this includes
both extrachromosomal circular DNA and DNA that has been
incorporated into the host chromosome(s). Where a vector is
being maintained by a host cell, the vector may either be
stably replicated by the cells during mitosis as an autonomous
structure, or be incorporated within the host's genome. A
vector contains multiple genetic elements positionally and
sequentially oriented, i.e., operatively linked with other
necessary elements such that nucleic acid in the vector
encoding the constructs of the invention can be transcribed,
and when necessary, translated in transfected cells.

The term "gene" as used herein is intended to refer
to a nucleic acid sequence which encodes a polypeptide. This
definition includes various sequence polymorphisms, mutations,
and/or sequence variants wherein such alterations do not
affect the function of the gene product. The term "gene" is
intended to include not only coding sequences but also
regulatory regions such as promoters, enhancers, and
termination regions. The term further includes all introns
and other DNA sequences spliced from the mRNA transcript,
along with variants resulting from alternative splice sites.

The term "plasmid" refers to an autonomous circular
DNA molecule capable of replication in a cell, and includes
both the expression and nonexpression types. Where a
recombinant microorganism or cell culture is described as
hosting an "expression plasmid", this includes both
extrachromosomal circular DNA molecules and DNA that has been
incorporated into the host chromosome(s). Where a plasmid is
being maintained by a host cell, the plasmid is either being
stably replicated by the cells during mitosis as an autonomous
structure or is incorporated within the host's genome.

The phrase "recombinant protein" or "recombinantly
produced protein" refers to a peptide or protein produced
using non-native cells that do not have an endogenous copy of
DNA able to express the protein. The cells produce the
protein because they have been genetically altered by the
introduction of the appropriate nucleic acid sequence. The
recombinant protein will not be found in association with
proteins and other subcellular components normally associated
with the cells producing the protein. The terms "protein" and
"polypeptide" are used interchangeably herein.

In general, a construct of the invention is provided
in an expression vector comprising the following elements
linked sequentially at appropriate distances for functional
expression: a tissue-specific promoter, an initiation site for
transcription, a 3' untranslated region, a 5' mRNA leader
sequence, a nucleic acid sequence encoding a polypeptide of
the invention, and a polyadenylation signal. Such linkage is
termed "operatively linked." Enhancer sequences and other
sequences aiding expression and/or secretion can also be
included in the expression vector. Additional genes, such as
those encoding drug resistance, can be included to allow
selection or screening for the presence of the recombinant
vector. Such additional genes can include, for example, genes
encoding neomycin resistance, multi-drug resistance, thymidine
kinase, beta-galactosidase, dihydrofolate reductase (DHFR),
and chloramphenicol acetyl transferase.

In the instant invention, tissue-specific expression
of the RB constructs of the invention is preferably
accomplished by the use of a promoter preferentially used by a
tissue of interest. Examples of tissue-specific promoters
include the promoter for creatine kinase, which has been used
to direct dystrophin cDNA expression in
muscle and cardiac tissue (Cox, et al. Nature 364:725-729
(1993)) and immunoglobulin heavy or light chain promoters for
the expression of suicide genes in B cells (Maxwell, et al.
Cancer Res. 51:4299-4304 (1991)). An endothelial
cell-specific regulatory region has also been characterized
(Jahroudi, et al. Mol. Cell. Biol. 14:999-1008 (1994)).
Amphotropic retroviral vectors have been constructed carrying
a herpes simplex virus thymidine kinase gene under the control
of either the albumin or alpha-fetoprotein promoters (Huber,
et al. Proc. Natl. Acad. Sci. U.S.A. 88:8039-8043 (1991)) to
target cells of liver lineage and hepatoma cells,
respectively. Such tissue specific promoters can be used in
retroviral vectors (Hartzoglou, et al. J. Biol. Chem.
265:17285-17293 (1990)) and adenovirus vectors (Friedman, et
al. Mol. Cell. Biol. 6:3791-3797 (1986); Wills et al. Cancer
Gene Therapy 3:191-197 (1995)) and still retain their tissue
specificity.

In the instant invention, a preferred promoter for
tissue-specific expression of exogenous genes is the human
smooth muscle alpha-actin promoter. Reddy, et al. (J. Cell
Biology 265:1683-1687 (1990)) disclosed the isolation and
nucleotide sequence of this promoter, while Nakano, et al.
(Gene 99:285-289 (1991)) disclosed transcriptional regulatory
elements in the 5' upstream and the first intron regions of
the human smooth muscle (aortic type) alpha-actin gene.

Petropoulos, et al. (J. Virol. 66:3391-3397 (1992))
disclosed a comparison of expression of bacterial
chloramphenicol acetyl transferase (CAT) operatively linked to
either the chicken skeletal muscle alpha actin promoter or the
cytoplasmic beta-actin promoter. These constructs were
provided in a retroviral vector and used to infect chicken
eggs.

Exemplary tissue-specific expression elements for
the liver include but are not limited to HMG-CoA reductase
promoter (Luskey, Mol. Cell. Biol. 7(5):1881-1893 (1987));
sterol regulatory element 1 (SRE-1; Smith et al. J. Biol.
Chem. 265(4):2306-2310 (1990)); phosphoenol pyruvate carboxy
kinase (PEPCK) promoter (Eisenberger et al. Mol. Cell. Biol.
12(3):1396-1403 (1992)); human C-reactive protein (CRP)
promoter (Li et al. J. Biol. Chem. 265(7):4136-4142 (1990));
human glucokinase promoter (Tanizawa et al. Mol. Endocrinology
6(7):1070-81 (1992)); cholesterol 7-alpha hydroxylase (CYP-7)
promoter (Lee et al. J. Biol. Chem. 269(20):14681-9 (1994));
beta-galactoside alpha-2,6 sialyltransferase promoter
(Svensson et al. J. Biol. Chem. 265(34):20863-8 (1990));
insulin-like growth factor binding protein (IGFBP-1) promoter
(Babajko et al. Biochem. Biophys. Res. Comm. 196(1):480-6
(1993)); aldolase B promoter (Bingle et al. Biochem. J.
294(Pt2):473-9 (1993)); human transferrin promoter (Mendelzon
et al. Nucl. Acids Res. 18(19):5717-21 (1990)); and collagen
type I promoter (Houglum et al. J. Clin. Invest. 94(2):808-14
(1994)).

Exemplary tissue-specific expression elements for
the prostate include but are not limited to the prostatic acid
phosphatase (PAP) promoter (Banas et al. Biochim. Biophys.
Acta 1217(2):188-94 (1994)); prostatic secretory protein of 94
(PSP 94) promoter (Nolet et al. Biochim. Biophys. Acta
1098(2):247-9 (1991)); prostate specific antigen complex
promoter (Casper et al. J. Steroid Biochem. Mol. Biol.
47(1-6):127-35 (1993)); and human glandular kallikrein gene
promoter (hgt-1) (Lilja et al. World J. Urology 11(4):188-91
(1993)).

Exemplary tissue-specific expression elements for
gastric tissue include but are not limited to the human
H+/K+-ATPase alpha subunit promoter (Tanura et al. FEBS Letters
298(2-3):137-41 (1992)).

Exemplary tissue-specific expression elements for
the pancreas include but are not limited to pancreatitis
associated protein promoter (PAP) (Dusetti et al. J. Biol.
Chem. 268(19):14470-5 (1993)); elastase 1 transcriptional
enhancer (Kruse et al. Genes and Development 7(5):774-86
(1993)); pancreas specific amylase and elastase enhancer
promoter (Wu et al. Mol. Cell. Biol. 11(9):4423-30 (1991);
Keller et al. Genes & Dev. 4(8):1316-21 (1990)); and pancreatic
cholesterol esterase gene promoter (Fontaine et al.
Biochemistry 30(28):7008-14 (1991)).

Exemplary tissue-specific expression elements for
the endometrium include but are not limited to the uteroglobin
promoter (Helftenbein et al. Annal. NY Acad. Sci. 622:69-79
(1991)).

Exemplary tissue-specific expression elements for
adrenal cells include but are not limited to cholesterol
side-chain cleavage (SCC) promoter (Rice et al. J. Biol. Chem.
265:11713-20 (1990)).

Exemplary tissue-specific expression elements for
the general nervous system include but are not limited to
gamma-gamma enolase (neuron-specific enolase, NSE) promoter
(Forss-Petter et al. Neuron 5(2):187-97 (1990)).

Exemplary tissue-specific expression elements for
the brain include but are not limited to the neurofilament
heavy chain (NF-H) promoter (Schwartz et al. J. Biol. Chem.
269(18):13444-50 (1994)).

Exemplary tissue-specific expression elements for
lymphocytes include but are not limited to the human
CGL-1/granzyme B promoter (Hanson et al. J. Biol. Chem.
266(36):24433-8 (1991)); the terminal deoxy transferase (TdT),
lambda 5, VpreB, and lck (lymphocyte specific tyrosine protein
kinase p56lck) promoter (Lo et al. Mol. Cell. Biol.
11(10):5229-43 (1991)); the human CD2 promoter and its
3' transcriptional enhancer (Lake et al. EMBO J. 9(10):3129-36
(1990)); and the human NK and T cell specific activation
(NKG5) promoter (Houchins et al. Immunogenetics 37(2):102-7
(1993)).

Exemplary tissue-specific expression elements for
the colon include but are not limited to pp60c-src tyrosine
kinase promoter (Talamonti et al. J. Clin. Invest. 91(1):53-60
(1993)); organ-specific neoantigens (OSNs), mw 40kDa (p40)
promoter (Ilantzis et al. Microbiol. Immunol. 37(2):119-28
(1993)); colon specific antigen-P promoter (Sharkey et al.
73(3 supp.) 864-77 (1994)).

Exemplary tissue-specific expression elements for
breast cells include but are not limited to the human
alpha-lactalbumin promoter (Thean et al. British J. Cancer
61(5):773-5 (1990)).

Other elements aiding specificity of expression in a
tissue of interest can include secretion leader sequences,
enhancers, nuclear localization signals, endosomolytic
peptides, etc. Preferably, these elements are derived from
the tissue of interest to aid specificity.

Techniques for nucleic acid manipulation of the
nucleic acid sequences of the invention such as subcloning
nucleic acid sequences encoding polypeptides into expression
vectors, labelling probes, DNA hybridization, and the like are
described generally in Sambrook et al., Molecular Cloning - A
Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor
Laboratory, Cold Spring Harbor, New York, (1989), which is
incorporated herein by reference. This manual is hereinafter
referred to as "Sambrook et al."

Once DNA encoding a sequence of interest is isolated
and cloned, one can express the encoded proteins in a variety
of recombinantly engineered cells. It is expected that those
of skill in the art are knowledgeable in the numerous
expression systems available for expression of such DNA.
No attempt to describe in detail the various methods known for
the expression of proteins in prokaryotes or eukaryotes is
made here.

In brief summary, the expression of natural or
synthetic nucleic acids encoding a sequence of interest will
typically be achieved by operably linking the DNA or cDNA to a
promoter (which is either constitutive or inducible), followed
by incorporation into an expression vector. The vectors can
be suitable for replication and integration in either
prokaryotes or eukaryotes. Typical expression vectors contain
transcription and translation terminators, initiation
sequences, and promoters useful for regulation of the
expression of the polynucleotide sequence of interest. To
obtain high level expression of a cloned gene, it is desirable
to construct expression plasmids which contain, at the minimum,
a strong promoter to direct transcription, a ribosome binding
site for translational initiation, and a
transcription/translation terminator. The expression vectors
may also comprise generic expression cassettes containing at
least one independent terminator sequence, sequences
permitting replication of the plasmid in both eukaryotes and
prokaryotes, i.e., shuttle vectors, and selection markers for
both prokaryotic and eukaryotic systems. See Sambrook et al.

The E2F-RB fusion constructs of the invention can be
introduced into the tissue of interest in vivo or ex vivo by a
variety of methods. In some embodiments of the invention, the
nucleic acid, preferably DNA, is introduced to cells by such
methods as microinjection, calcium phosphate precipitation,
liposome fusion, or biolistics. In further embodiments, the
DNA is taken up directly by the tissue of interest. In other
embodiments, the constructs are packaged into a viral vector
system to facilitate introduction into cells.

Viral vector systems useful in the practice of the
instant invention include adenovirus, herpesvirus,
adeno-associated virus, minute virus of mice (MVM), HIV, sindbis
virus, and retroviruses such as Rous sarcoma virus and MoMLV.
Typically, the constructs of the instant invention are
inserted into such vectors to allow packaging of the E2F-RB
expression construct, typically with accompanying viral DNA,
infection of a sensitive host cell, and expression of the
E2F-RB gene. A particularly advantageous vector is the adenovirus
vector disclosed in Wills, et al. Human Gene Therapy
5:1079-1088 (1994).

In still other embodiments of the invention, the
recombinant DNA constructs of the invention are conjugated to
a cell receptor ligand for facilitated uptake (e.g.,
invagination of coated pits and internalization of the
endosome) through a DNA linking moiety (Wu, et al. J. Biol.
Chem. 263:14621-14624 (1988); WO 92/06180). For example, the
DNA constructs of the invention can be linked through a
polylysine moiety to asialo-orosomucoid, which is a ligand for
the asialoglycoprotein receptor of hepatocytes.

Similarly, viral envelopes used for packaging the
constructs of the invention can be modified by the addition of
receptor ligands or antibodies specific for a receptor to
permit receptor-mediated endocytosis into specific cells
(e.g., WO 93/20221, WO 93/14188; WO 94/06923). In some
embodiments of the invention, the DNA constructs of the
invention are linked to viral proteins, such as adenovirus
particles, to facilitate endocytosis (Curiel, et al. Proc.
Natl. Acad. Sci. U.S.A. 88:8850-8854 (1991)). In other
embodiments, molecular conjugates of the instant invention can
include microtubule inhibitors (WO 94/06922); synthetic
peptides mimicking influenza virus hemagglutinin (Plank, et
al. J. Biol. Chem. 269:12918-12924 (1994)); and nuclear
localization signals such as SV40 T antigen (WO 93/19768).

In some embodiments of the invention, the RB
polypeptides of the invention are administered directly to a
patient in need of treatment. A "therapeutically effective"
dose is a dose of polypeptide sufficient to prevent or reduce
severity of a hyperproliferative disorder. As used herein,
the term "hyperproliferative cells" includes but is not
limited to cells having the capacity for autonomous growth,
i.e., existing and reproducing independently of normal
regulatory mechanisms. Hyperproliferative diseases may be
categorized as pathologic, i.e., deviating from normal cells,
characterizing or constituting disease, or may be categorized
as non-pathologic, i.e., deviating from normal but not
associated with a disease state. Pathologic
hyperproliferative cells are characteristic of the following
disease states: restenosis, diabetic retinopathy, thyroid
hyperplasia, Graves' disease, psoriasis, benign prostatic
hypertrophy, Li-Fraumeni syndrome including breast cancer,
sarcomas and other neoplasms, bladder cancer, colon cancer,
lung cancer, and various leukemias and lymphomas. Examples of
non-pathological hyperproliferative cells are found, for
instance, in mammary ductal epithelial cells during
development of lactation and also in cells associated with
wound repair. Pathological hyperproliferative cells
characteristically exhibit loss of contact inhibition and a
decline in their ability to selectively adhere, which implies a
further breakdown in intercellular communication. These
changes include stimulation to divide and the ability to
secrete proteolytic enzymes.

The constructs of the invention are useful in the
therapy of various cancers and other conditions in which the
administration of RB is advantageous, including but not
limited to peripheral vascular diseases and diabetic
retinopathy. Although any tissue can be targeted for which
some tissue-specific expression element, such as a promoter,
can be identified, of particular interest is the
tissue-specific administration of an RB construct for
hyperproliferative disorders such as restenosis, for which the
smooth muscle actin promoter is preferable.

The compositions of the invention will be formulated
for administration by manners known in the art acceptable for
administration to a mammalian subject, preferably a human. In
some embodiments of the invention, the compositions of the
invention can be administered directly into a tissue by
injection or into a blood vessel supplying the tissue of
interest. In further embodiments of the invention the
compositions of the invention are administered
"locoregionally", i.e., intravesically, intralesionally,
and/or topically. In other embodiments of the invention, the
compositions of the invention are administered systemically by
injection, inhalation, suppository, transdermal delivery, etc.
In further embodiments of the invention, the compositions are
administered through catheters or other devices to allow
access to a remote tissue of interest, such as an internal
organ. The compositions of the invention can also be
administered in depot type devices, implants, or encapsulated
formulations to allow slow or sustained release of the
compositions.

The invention provides compositions for
administration which comprise a solution of the compositions
of the invention dissolved or suspended in an acceptable
carrier, preferably an aqueous carrier. A variety of aqueous
carriers may be used, e.g., water, buffered water, 0.8%
saline, 0.3% glycine, hyaluronic acid and the like. These
compositions may be sterilized by conventional, well-known
sterilization techniques, or may be sterile filtered. The
resulting aqueous solutions may be packaged for use as is, or
lyophilized, the lyophilized preparation being combined with a
sterile solution prior to administration. The compositions
may contain pharmaceutically acceptable auxiliary substances
as required to approximate physiological conditions, such as
pH adjusting and buffering agents, tonicity adjusting agents,
wetting agents and the like, for example, sodium acetate,
sodium lactate, sodium chloride, potassium chloride, calcium
chloride, sorbitan monolaurate, triethanolamine oleate, etc.

The concentration of the compositions of the
invention in the pharmaceutical formulations can vary widely,
i.e., from less than about 0.1%, usually at or at least about
2% to as much as 20% to 50% or more by weight, and will be
selected primarily by fluid volumes, viscosities, etc., in
accordance with the particular mode of administration
selected.

The compositions of the invention may also be
administered via liposomes. Liposomes include emulsions,
foams, micelles, insoluble monolayers, liquid crystals,
phospholipid dispersions, lamellar layers and the like. In
these preparations the composition of the invention to be
delivered is incorporated as part of a liposome, alone or in
conjunction with a molecule which binds to a desired target,
such as an antibody, or with other therapeutic or immunogenic
compositions. Thus, liposomes either filled or decorated with
a desired composition of the invention can be
delivered systemically, or can be directed to a tissue of
interest, where the liposomes then deliver the selected
therapeutic/immunogenic peptide compositions.

Liposomes for use in the invention are formed from
standard vesicle-forming lipids, which generally include
neutral and negatively charged phospholipids and a sterol,
such as cholesterol. The selection of lipids is generally
guided by consideration of, e.g., liposome size, acid lability
and stability of the liposomes in the blood stream. A variety
of methods are available for preparing liposomes, as described
in, e.g., Szoka et al. Ann. Rev. Biophys. Bioeng. 9:467
(1980), U.S. Patent Nos. 4,235,871, 4,501,728, 4,837,028, and
5,019,369, incorporated herein by reference.

A liposome suspension containing a composition of
the invention may be administered intravenously, locally,
topically, etc. in a dose which varies according to, inter
alia, the manner of administration, the composition of the
invention being delivered, and the stage of the disease being
treated.

For solid compositions, conventional nontoxic solid
carriers may be used which include, for example,
pharmaceutical grades of mannitol, lactose, starch, magnesium
stearate, sodium saccharin, talcum, cellulose, glucose,
sucrose, magnesium carbonate, and the like. For oral
administration, a pharmaceutically acceptable nontoxic
composition is formed by incorporating any of the normally
employed excipients, such as those carriers previously listed,
and generally 10-95% of active ingredient, that is, one or
more compositions of the invention, and more
preferably at a concentration of 25%-75%.

For aerosol administration, the compositions of the
invention are preferably supplied in finely divided form along
with a surfactant and propellant. Typical percentages of
compositions of the invention are 0.01%-20% by weight,
preferably 1%-10%. The surfactant must, of course, be
nontoxic, and preferably soluble in the propellant.
Representative of such agents are the esters or partial esters
of fatty acids containing from 6 to 22 carbon atoms, such as
caproic, octanoic, lauric, palmitic, stearic, linoleic,
linolenic, olesteric and oleic acids with an aliphatic
polyhydric alcohol or its cyclic anhydride. Mixed esters,
such as mixed or natural glycerides may be employed. The
surfactant may constitute 0.1%-20% by weight of the
composition, preferably 0.25%-5%. The balance of the
composition is ordinarily propellant. A carrier can also be
included, as desired, as with, e.g., lecithin for intranasal
delivery.

The constructs of the invention can additionally be
delivered in a depot-type system, an encapsulated form, or an
implant by techniques well-known in the art. Similarly, the
constructs can be delivered via a pump to a tissue of
interest.

In some embodiments of the invention, the
compositions of the invention are administered ex vivo to
cells or tissues explanted from a patient, then returned to
the patient. Examples of ex vivo administration of gene
therapy constructs include Arteaga et al. Cancer Research
56(5):1098-1103 (1996); Nolta et al. Proc. Natl. Acad. Sci. USA
93(6):2414-9 (1996); Koc et al. Seminars in Oncology
23(1):46-65 (1996); Raper et al. Annals of Surgery 223(2):116-26
(1996); Dalesandro et al. J. Thorac. Cardiovasc. Surg.
111(2):416-22 (1996); and Makarov et al. Proc. Natl. Acad. Sci.
USA 93(1):402-6 (1996).

In some embodiments of the invention, the constructs
of the invention are administered to a cardiac artery after
balloon angioplasty to prevent or reduce the severity of
restenosis. The constructs of the invention can be used to
coat the device used for angioplasty (see, for example,
Willard, et al. Circulation 89:2190-2197 (1994); French, et
al. Circulation 90:2402-2413 (1995)). In further embodiments,
the fusion polypeptides of the invention can be used in the
same manner.

The following examples are included for illustrative
purposes and should not be considered to limit the present
invention.

Example I 

E2F-RB Fusions

A. Introduction

In this example, expression plasmids which encode
different segments of E2F fused to RB56 polypeptide were
constructed. RB56 is a subfragment of full length RB which
contains the "pocket" domains necessary for growth suppression
(Hiebert, et al. MCB 13:3384-3391 (1993); Qin, et al. Genes
and Dev. 6:953-964 (1992)). E2F194 contains E2F amino acids
95-194. This fragment contains only the DNA binding domain of
E2F. E2F286 contains the DNA binding domain and the DP-1
heterodimerization domain. Both E2F fragments lack the
N-terminal cyclin A-kinase binding domain, which appears to
down-regulate the DNA binding activity of E2F (Krek et al.
Cell 83:1149-1158 (1995); Krek et al. Cell 78:161-172 (1994)).

B. Construction of Vectors

Plasmid pCTM contains a CMV promoter, a tripartite
adenovirus leader flanked by T7 and SP6 promoters, and a
multiple cloning site with a bovine growth hormone (BGH)
polyadenylation site and an SV40 polyadenylation site
downstream. A diagrammatic representation of pCTM is provided
in Figure 3. The DNA sequence for pCTM is provided in Figure
4.

pCTMI was constructed from pCTM by digesting pCTM
with Xho I and Not I and subcloning a 180 bp intron Xho I-Not I
fragment from a pCMV-β-gal vector (Clontech). A
diagrammatic representation of pCTMI is provided in Figure 5.
The DNA sequence is provided in Figure 6.

pCTMIE was constructed by amplifying the SV40
enhancer from SV40 viral DNA in a polymerase chain reaction.
The amplified product was digested with Bgl II and inserted
into BamHI-digested pCTMI and ligated in the presence of
BamHI. The plasmid is depicted diagrammatically in Figure 7.
The DNA sequence is provided in Figure 8.

pCTM-RB was prepared as follows. A 3.2 KB Xba I-
Cla I fragment of pETRBc (Huang et al. Nature 350:160-162
(1991)) containing the full length human RB cDNA was ligated
to Xba I-Cla I digested pCTM. pCTM-RB56 was prepared by
ligating the digested pCTM to a 1.7 KB Xba I-Cla I fragment
containing the coding sequence for RB56. pCTMI-RB, pCTMIE-RB,
pCTMI-RB56 (amino acids 381-928) and pCTMIE-RB56 (amino acids
381-928) were all constructed by the same methods.

C. RB-E2F Fusion Constructs

Figure 9 depicts the fusion constructs used in these
studies. These E2F constructs commenced at amino acid 95 and
lacked part of the cyclin A binding domain. E2F437 contained
the DNA binding domain (black), heterodimerization domain
(white) and transactivation domain (stippled). E2F194
contained solely the DNA binding domain. E2F286 contained the
DNA binding domain and DP-1 heterodimerization domain.
RB56-5s refers to an RB variant having alanine substitutions at
amino acid residues 606, 612, 788, 807 and 811. In
E2F194-RB56-5s and E2F286-RB56-5s, the E2F fragments were fused
in frame to codon 379 of RB-5s. RB56-C706F contained an
inactivating point mutation (Kaye et al. Proc. Natl. Acad.
Sci. U.S.A. 87:6922-6926 (1990)).

pCMV-E2F194 and pCMV-E2F437 were constructed as
follows. DNA encoding amino acids 95-194 of E2F (containing
the DNA binding domain) or amino acids 95-437 was amplified in
a polymerase chain reaction, digested with Hind III, and ligated
into Sma I/Hind III digested pCMV-RB56 vectors. pCMV-E2F286 was
constructed by digesting pCMV-E2F437 with Afl II, treating the
ends with DNA pol I (Klenow fragment) and religating in the
presence of Afl II. The blunt end ligation created a stop
codon at position 287. pCMV-E2F286-5s was constructed by
ligating Afl II (blunt)/Hind III digested pE2F437 to a Sal I
(blunt)-Hind III fragment containing the RB56-5s coding
sequence. pCTMIE-E2F194-5s and pCTMIE-E2F286-RB5s were
constructed by ligating EcoRI-EcoRV digested pCTMIE (4.2 KB)
to Hind III (blunt)-EcoRI fragments from either
pCMV-E2F194-RB5s or pCMV-E2F286-RB5s.
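
The residue ranges given above for the fusion constructs can be tabulated with a short script. The following is only an illustrative sketch: the names `Segment` and `fusion_layout` are hypothetical, and the calculation assumes that the E2F fragment is joined directly to RB residue 379 with no linker or additional initiator residues, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A contiguous block of residues taken from a parent protein."""
    source: str   # parent protein name, e.g. "E2F" or "RB"
    first: int    # first residue, numbered as in the parent protein
    last: int     # last residue, inclusive

    @property
    def length(self) -> int:
        return self.last - self.first + 1


def fusion_layout(segments):
    """Return (source, start, end) positions of each segment in the fusion.

    Positions are 1-based and inclusive, and assume the segments are
    joined head to tail with no linker residues (an assumption).
    """
    layout = []
    offset = 1
    for seg in segments:
        layout.append((seg.source, offset, offset + seg.length - 1))
        offset += seg.length
    return layout


# Residue ranges as stated in the text: E2F 95-194 or 95-286, RB 379-928.
E2F194_RB56 = [Segment("E2F", 95, 194), Segment("RB", 379, 928)]
E2F286_RB56 = [Segment("E2F", 95, 286), Segment("RB", 379, 928)]

for name, construct in (("E2F194-RB56", E2F194_RB56),
                        ("E2F286-RB56", E2F286_RB56)):
    total = sum(seg.length for seg in construct)
    print(name, fusion_layout(construct), "total residues:", total)
```

Under these assumptions the E2F194 fusion spans 650 residues and the E2F286 fusion spans 742 residues; actual constructs may differ if junction residues were introduced during cloning.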

D. Promoter Repression 

To measure the effect of the E2F-RB fusion proteins,
cervical carcinoma cell line C33A (ATCC # HTB-31) was
transfected with equivalent amounts of E2F194-RB56 or E2F RB56
with an E2-CAT reporter plasmid (see, e.g., Weintraub et al.
Nature 358:259-261 (1992)).

In the C33A assay, 250,000 C33A cells were seeded
into each well of 6-well tissue culture plates and allowed
to adhere overnight. 5 µg each of pCMV-RB56, pCMV-E2F RB56,
or pCMV-E2F plasmid were cotransfected (calcium phosphate
method, MBS transfection kit, Stratagene) with 5 µg of the
indicated reporter construct (E2-CAT or SV-CAT) and 2.5 µg β-gal
plasmid (pCMV-β, Clontech) per well into duplicate wells.
Cells were harvested 72 hours after transfection and extracts
were prepared.

In the 5637 assay, 250,000 5637 cells were seeded as
described above. 1 µg each of RB or E2F-RB fusion plasmid,
E2-CAT or SV-CAT reporter plasmid and pCMV-β-galactosidase
were cotransfected using the lipofectin reagent (BRL,
Bethesda, Maryland) according to the manufacturer's
instructions.

CAT assays were performed using either 20 µL (C33A)
or 50 µL (5637) of cell extract (Gorman et al. Mol. Cell.
Biol. 2:1044 (1982)). TLCs were analyzed on a PhosphorImager
SF (Molecular Dynamics). CAT activities were normalized for
transfection efficiency according to the β-galactosidase
activities of each extract. β-galactosidase activities of
extracts were assayed as described by Rosenthal et al. (Meth.
Enzym. 152:704 (1987)).
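
The normalization described above, and the fold repression that is derived from it, can be sketched as follows. This is a minimal illustration rather than the analysis actually used: the function names are hypothetical, the normalization is assumed to be a simple ratio of CAT activity to β-galactosidase activity, and the input values are invented solely to show the arithmetic.

```python
def normalized_cat(cat_activity: float, bgal_activity: float) -> float:
    """CAT activity divided by beta-galactosidase activity of the same extract.

    Dividing by the co-transfected beta-gal signal corrects for
    well-to-well differences in transfection efficiency.
    """
    if bgal_activity <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return cat_activity / bgal_activity


def fold_repression(control_norm: float, sample_norm: float) -> float:
    """Normalized reporter-only activity relative to activity with effector."""
    if sample_norm <= 0:
        raise ValueError("normalized sample activity must be positive")
    return control_norm / sample_norm


# Invented example values (arbitrary units), not data from the experiments.
reporter_alone = normalized_cat(cat_activity=60.0, bgal_activity=1.5)
with_effector = normalized_cat(cat_activity=4.0, bgal_activity=1.0)

fold = fold_repression(reporter_alone, with_effector)
print(f"fold repression: {fold:.1f}")
```

With these invented inputs the normalized activities are 40 and 4, giving a 10 fold repression; duplicate wells would ordinarily be averaged before the ratio is taken.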

The results of these studies were as follows.
Transfection of the E2-CAT reporter alone or in the presence
of the nonfunctional control RB56-H209 mutant yielded
relatively high CAT activity. Cotransfection of wild-type
RB56 or the variant RB56-5s resulted in a 10 to 12 fold
repression of CAT activity, indicating that RB56 and RB56-5s
are both capable of efficiently repressing E2F-dependent
transcription. E2F194-RB5s and E2F286-RB5s repressed
transcription approximately 50 fold. Transcriptional
repression required both the RB56 and the E2F components of
the fusion proteins, as expression of E2F194 and E2F286 did
not mediate transcriptional repression. No repression of
SV40-CAT transcription occurred with E2F-RB constructs, thus
demonstrating the specificity of the transcriptional
repression by E2F-RB for the E2 promoter. These results are
depicted diagrammatically in Figure 10.

E. Cell Cycle Arrest

The ability of E2F-RB fusion polypeptides to cause
G1 arrest in Saos-2 (RB-/- cells) (ATCC # HTB-85) and C33A
cells was investigated. Previous studies have shown that
RB-mediated E2 promoter repression and G1 arrest are linked in
Saos-2 cells but dissociated in C33A (RBmut) cells (Xu, et al.
PNAS 92:1357-1361 (1992)). Cells were washed in PBS and were
fixed in 1 mL of -20°C 70% ethanol for 30 minutes. Cells were
collected by centrifugation and resuspended in 0.5 mL 2% serum
containing 10 µg/ml RNase A and incubated for 30 minutes at
37°C. 0.5 mL of PBS containing propidium iodide (100 µg/ml) was
added to each sample, mixed and cells were filtered through a
FACS tube cap strainer. FACS analysis was performed on a
FACScan (Becton-Dickinson) using doublet discrimination.
5,000-10,000 CD20+ events were analyzed. Percent of cells in
G0/G1, S, and G2/M was determined using Modfit modeling software.

The results of this experiment were as follows.
Both full length RB110 and the truncated version RB56, but not
the control mutant RB-H209, caused G1 arrest in Saos-2 cells
(Table 1). Similarly, RB56-5s, E2F194-RB56-5s and
E2F286-RB56-5s all were capable of arresting cells in G0/G1.
Transfection of the DNA binding domain, E2F194, did not block
S-phase entry in Saos-2 as previously described for rodent
cells (Dobrowolski, et al. Oncogene 9:2605-2612 (1994)). In
contrast, RB110, RB56, and E2F-RB fusion proteins were not
capable of arresting C33A cell lines, indicating that the
transcriptional repression observed in these cells does not
translate into G1 arrest.

The ability of the E2F-RB fusion proteins to arrest
5637 cells was also investigated (Table 2). RB56 and RB56-5s
both efficiently arrested cells in G0/G1 (approximately 90% of
cells in G0/G1), whereas E2F194-RB56-5s and E2F286-RB56-5s were
slightly less efficient (about 80% of cells in G0/G1) at
promoting G0/G1 arrest. Without being limited to any one
theory, the less efficient arrest of both Saos-2 and 5637
cells by the E2F-RB fusion proteins appears due to the lower
levels of steady-state protein produced in these cells (Figure
11, panels b and c).

Table 1: Cell Cycle Regulation by RB and E2F-RB fusion proteins in RBneg cells

                          % Cells
CD20+           G0/G1     G2/M      S-phase
H209            52.1      27.1      20.8
p56RB           78.8      14.2       7.0
p110RB          70.9      14.3      14.8
p56RB-5s        84.8      13.2       2.0
p56RB-p5        81.3      11.5       7.3
E2F-194-5s      77.8      14.9       7.3
E2F-286-5s      72.2      15.0      12.8
E2F-194         49.9      28.0      22.1

Table 2: Growth Suppression of 5637 Bladder Cells by RB and E2F-RB fusion Proteins

                          % Cells
5637/CD20+      G0/G1     S         G2/M
CD20            59.7      16.9      20.6
RB56-C706F      57.4      16.3      24.3
RB56WT          90.7       4.12      4.88
RB56-5s         89.91      3.51      6.1
E2F194-5s       80.1       1.31      0
E2F-286-5s      79.21      8.1       0


F. Activity of Fusion Proteins in Functional RB Background



The activity of the E2F-RB fusion proteins in a
cellular background containing functional RB was then
determined. NIH-3T3 cells were transfected with RB56 or
E2F-RB56 fusions and stained with anti-RB monoclonal antibody 3C8
(Wen et al. J. Immunol. Meth. 169:231-240 (1994)). FACS
analysis was performed on the RB-expressing cells. The
results are shown in Figure 12. The non-gated population (g)
shows the characteristic cell cycle distribution for NIH-3T3
cells (60% G0, 28% S, 10% G2/M). In contrast, in cells
transfected with RB56 (a,b) or E2F-RB fusion proteins (c-f),
greater than 90% of the RB-expressing cells were arrested in
G0/G1. These data demonstrate that the ability of RB and
E2F-RB56 fusions to arrest cells in G0/G1 is not limited to
RB-negative tumor cells. The relative levels of protein
expressed in transfected NIH-3T3 cells were also investigated.
RB110 was not expressed efficiently in these cells.

Thus, these data demonstrate that E2F-RB fusion
proteins are more efficient transcriptional repressors than
either pRB or RB56 alone, and that RB can repress
transcription by remaining bound to E2F rather than directly
blocking the transactivation domain of E2F. These data
support the use of E2F-RB fusions as RB agonists in both RB+
cells and in RB negative or RB mutant cells.

Example II

Tissue-Specific Expression of E2F-RB Fusions

A. Construction of Recombinant Adenovirus

In this experiment, recombinant adenoviruses
comprising an RB polypeptide under the control of a CMV or
smooth muscle alpha actin promoter were generated.

The smooth muscle α-actin promoter (bases -670
through +5; Reddy et al. "Structure of the Human Smooth Muscle
α-Actin Gene." J. Biol. Chem. 265:1683-1687 (1990); Nakano,
et al. "Transcriptional Regulatory Elements In The 5' Upstream
and First Intron Regions of The Human Smooth Muscle (aortic
type) α-Actin-Encoding Gene." Gene 99:285-289 (1991)) was
isolated by PCR from a genomic library with 5' Xho I and Avr
II and 3' Xba I, Cla I and Hind III restriction sites added
for cloning purposes. The fragment was subcloned as an Xho I,
Hind III fragment into a plasmid for sequencing to verify base
composition. A fusion construct 286-56 containing the DNA
binding and heterodimerization domains of E2F-1 (amino acids
95-286) linked to p56 (amino acids 379-928 of full length RB)
was subcloned as an Xba I, Cla I fragment directly downstream
of the smooth muscle α-actin promoter, and this expression
cassette was digested out and cloned into the plasmid
pAd/ITR/IX- as an Xba I to Avr II, and Cla I fragment to create
the plasmid pASN286-56. This plasmid consisted of the
adenovirus type 5 inverted terminal repeat (ITR), packaging
signals and E1a enhancer, followed by the human smooth muscle
α-actin promoter and 286-56 cassette, and then Ad 2 sequence
4021-10462 (which contains the E1b/protein IX poly A signal) in
a pBR322 background.
Recombinant adenovirus was produced by standard procedures.
The plasmid pASN286-56 was linearized with Ngo MI and
co-transfected into 293 cells with the large fragment of Cla I
digested rAd34 which has deletions in both the E3 and E4
regions of adenovirus type 5. Ad34 was a serotype 5 derivative
with a 1.9 KB deletion in early region 3 resulting from
deletion of the Xba I restriction fragment extending from Ad5
coordinates 28593 to 30470 and a 1.4 KB deletion of early
region 4 resulting from a Taq I fragment of E4 (coordinates
33055-35573) being replaced with a cDNA containing E4 ORF 6
and 6/7.

Recombinant adenovirus produced by homologous
recombination was isolated and identified by restriction
digest analysis and further purified by limiting dilution.
Additional control recombinant adenoviruses are described
elsewhere and include the control virus ACN (CMV promoter,
Wills, et al. "Gene Therapy For Hepatocellular Carcinoma:
Chemosensitivity Conferred By Adenovirus-Mediated Transfer of
The HSV-1 Thymidine Kinase Gene." Cancer Gene Therapy
2:191-197 (1995)), and ACN56 (RB expressed under control of a
CMV promoter).

ACN56 was prepared as follows. A plasmid containing p56
cDNA was constructed by replacing the p53 cDNA from the
plasmid ACNP53 (Wills et al. Human Gene Therapy 5:1079-1088
(1994)) with a 1.7 KB Xba I-BamHI fragment isolated from
plasmid pET 9a-Rb56 (Antelman et al. Oncogene 10:697-704
(1995)) which contains p56 cDNA. The resulting plasmid
contained amino acids 381-928 of p56, the Ad5 inverted
terminal repeat, viral packaging signals and E1a enhancer,
followed by the human cytomegalovirus immediate early promoter
(CMV) and Ad 2 tripartite leader cDNA to drive p56 expression.
The p56 cDNA was followed by Ad 2 sequence 4021-10462 in a
pBR322 background. This plasmid was linearized with EcoRI
and cotransfected with the large fragment of Bsp 106 digested
DL327 (E3 deleted; Thimmappaya et al. Cell 31:543-551 (1982))
or h5ile4 (E4 deleted; Hemstrom et al. J. Virol. 62:3258-3264
(1988)). Recombinant viruses were further purified by
limiting dilution.

B. Cellular Proliferation

In this experiment, cell lines were infected in
culture with recombinant adenovirus RB constructs to ascertain
the relative expression of the RB polypeptide and the effect
on cell proliferation.

For H358 (ATCC # CRL 5807) and MDA-MB468 (ATCC # HTB
132, breast adenocarcinoma) cells, 5,000 cells/well were plated
in normal growth media in a 96 well microtiter plate (Costar)
and allowed to incubate overnight at 37°C, 7% CO2. Viruses
were serially diluted in growth media and used to infect cells
at the indicated doses for 48 hours. At this point,
3H-thymidine was added (Amersham, 0.5 µCi/well) and the cells
were incubated at 37°C for another 3 hours prior to harvest.
Both A7r5 (ATCC CRL 1444, rat smooth muscle) and A10 (ATCC CRL
1476, rat smooth muscle) cells were seeded at 3,000 cells/well
in either DME + 0.5% FCS or DME + 20% FCS respectively. Virus
was serially diluted in the seeding media and used to infect
the cells at the doses indicated in the Figures. The
infection and labelling procedures were the same for A10 cells
as for the H358 and MDA-MB468 cells except that 2 µCi/well of
label was used. The A7r5 cells were not infected with virus
until 48 hours after seeding. Forty-eight hours after
infection, the serum concentration was raised to 10% FCS and 2
µCi/well of 3H-thymidine was added and incubation continued
for an additional 3 hours prior to harvest. All cells were
harvested by aspirating media from the wells, trypsinization
of the cells, and harvesting using a 96 well GF/C filter with
a Packard TopCount cell harvester. Results are plotted as
the mean percentage (+/- SD) of media treated control
proliferation versus dose of virus in Figures 13 and 14.
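
The quantity plotted in Figures 13 and 14, proliferation as a mean percentage (+/- SD) of the media treated control, can be computed along the following lines. This is a sketch under stated assumptions: the function name `percent_of_control` is hypothetical, each replicate well is assumed to be expressed relative to the mean of the control wells, and the counts shown are invented for illustration only.

```python
from statistics import mean, stdev


def percent_of_control(treated_counts, control_counts):
    """Return (mean, SD) of treated wells as a percentage of control.

    Each treated well's 3H-thymidine count is divided by the mean count
    of the media-treated control wells; SD is the sample standard
    deviation across the treated replicates.
    """
    control_mean = mean(control_counts)
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    percents = [100.0 * count / control_mean for count in treated_counts]
    return mean(percents), stdev(percents)


# Invented counts per minute for one virus dose, not experimental data.
control_wells = [1000.0, 1000.0, 1000.0]
treated_wells = [500.0, 600.0, 400.0]

pct_mean, pct_sd = percent_of_control(treated_wells, control_wells)
print(f"{pct_mean:.1f}% +/- {pct_sd:.1f}% of control")
```

Repeating the calculation at each virus dose would give the points of a dose-response curve of the kind described for the Figures.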

Thus, Figure 13 depicts a comparison of the effects
of adenovirus p56 constructs on the smooth muscle cell lines
A10 and A7r5. The CMV-driven p56 (ACN56) virus inhibited A10
growth to approximately the same extent as the actin
promoter-driven E2F-fusion constructs (ASN286-56 #25,26). In
Figure 14, the effects of adenovirus constructs on inhibition
of a breast cancer cell line, MDA-MB468, and a non-small cell
lung carcinoma cell line, H358, are depicted. In these
experiments, actin promoter-driven E2F-p56 was ineffective,
while the CMV promoter-driven p56 was effective in inhibiting
growth of non-smooth muscle cells.

To determine whether the non-smooth muscle cells
were more infectable with adenovirus than the smooth muscle
cell lines used, the four cell lines, H358, MB468, A7r5, and
A10, were infected at an MOI of 5 with an adenovirus expressing
β-galactosidase (ACβGL; Wills, et al. Human Gene Therapy
5:1079-1088 (1994)) and the degree of β-gal staining was
examined. As shown in Figure 15 (top), the non-smooth muscle
cell lines were significantly more infectable than the smooth
muscle cell lines. In a further test, cells were infected at
higher multiplicities of infection (50, 100, 250, 500) with
ACN56 and the amount of p56 present in the infected cells was
detected by autoradiography. As can be seen in Figure 15
(bottom), the non-muscle cell lines had significantly more p56
present, since as a result of their greater infectivity,
infected cells have a greater viral load and thus more copies
of the p56 template driven by the non-tissue specific CMV
promoter.

In a further experiment, the specificity of the
actin smooth muscle promoter for smooth muscle tissue was
ascertained. In this experiment, β-gal expression levels in
cells infected with β-gal constructs driven by different
promoters were measured. As can be seen in Figure 19, despite
the lower infectivity of the smooth muscle cells, expression
was only evident in these cells using the smooth muscle alpha
actin promoter.

Figure 21 depicts a comparison of the effects of a
CMV driven p56 recombinant adenovirus (ACN56E4) vs a human
smooth muscle alpha-actin promoter driven E2F-p56 fusion
construct (ASN286-56) vs control adenoviral constructs
containing either the CMV or smooth muscle alpha-actin
promoters without a downstream transgene (ACNE3 or ASBE3-2
isolates shown, respectively). Assays were 3H-thymidine
uptake either in a smooth muscle cell line (A7r5) or a
non-muscle cell line (MDA-MB468, breast carcinoma). Results
demonstrated muscle tissue specificity using the smooth muscle
alpha-actin promoter and specific inhibition by both the p56
and E2F-p56 transgenes relative to their respective controls.

C. Inhibition of Restenosis
The model of balloon injury was based on that
described by Clowes, et al. (Clowes, Lab. Invest. 49:327-333
(1983)). Male Sprague-Dawley rats weighing 400-500 g were
anesthetized with an intraperitoneal injection of sodium
pentobarbital (45 mg/kg, Abbott Laboratories, North Chicago,
Illinois). The bifurcation of the left common carotid artery
was exposed through a midline incision and the left common,
internal, and external carotid arteries were temporarily
ligated. A 2F embolectomy catheter (Baxter Edwards Healthcare
Corp., Irvine, CA) was introduced into the external carotid
and advanced to the distal ligation of the common carotid.
The balloon was inflated with saline and drawn towards the
arteriotomy site 3 times to produce a distending,
deendothelializing injury. The catheter was then withdrawn.
Adenovirus (1 x 10^9 pfu of Ad-RB (ACNRb) or Ad-p56 (ACN56) in
a volume of 10 µl diluted to 100 µl with 15% (wt/vol) Poloxamer
407 (BASF, Parsippany, N.J.), or Ad-β-Gal (1 x 10^9 pfu, diluted
as above)) was injected via a cannula, inserted just proximal to
the carotid bifurcation, into a temporarily isolated segment of
the artery. The adenovirus solution was incubated for 20
minutes, after which the viral infusion was withdrawn and the
cannula removed. The proximal external carotid artery was
then ligated and blood flow was restored to the common carotid
artery by release of the ligatures. The experimental protocol
was approved by the Institutional Animal Care and Use
Committee and complied with the "Guide for the Care and Use of
Laboratory Animals" (NIH Publication No. 86-23, revised
1985).

Rats were sacrificed at 14 days following treatment
with an intraperitoneal injection of pentobarbital (100
mg/kg). The initially balloon injured segment of the left
common carotid artery, from the proximal edge of the omohyoid
muscle to the carotid bifurcation, was perfused with saline
and dissected free of the surrounding tissue. The tissue was
fixed in 100% methanol until embedded in paraffin. Several
4-µm sections were cut from each tissue specimen. One section
from each specimen was stained with hematoxylin and eosin and
another with Richardson's combination elastic-trichrome stain
for conventional light microscopic analysis.

Histological images of cross sections of hematoxylin
and eosin or elastic-trichrome stained arterial sections were
projected onto a digitizing board (Summagraphics) and the
intimal, medial and luminal areas were measured by
quantitative morphometric analysis using a computerized
sketching program (MACMEASURE, version 1.9, National Institute
of Mental Health).

Results were expressed as the mean ± S.E.M.
Differences between groups were analyzed using an unpaired
two-tailed Student's t test. Statistical significance was
assumed when the probability of a null effect was <0.05.
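
The summary statistics named above can be sketched as follows. This is only an illustration: the function names are hypothetical, the t statistic is computed with the classical pooled-variance form of Student's test (the text does not state which variant was used), and the measurements are invented. The script stops at the t statistic and degrees of freedom; the two-tailed probability would be read from a t table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return sqrt(variance(values) / len(values))


def student_t(group_a, group_b):
    """Unpaired Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumes equal variances in the
    two groups, as in the classical Student's test.
    """
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a)
              + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Invented measurements (arbitrary units), not data from the study.
treated = [1.0, 2.0, 3.0]
control = [4.0, 5.0, 6.0]

t_stat, dof = student_t(treated, control)
print(f"treated: {mean(treated):.2f} +/- {sem(treated):.2f} (S.E.M.)")
print(f"control: {mean(control):.2f} +/- {sem(control):.2f} (S.E.M.)")
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

For 4 degrees of freedom the two-tailed critical value at the 0.05 level is 2.776, so a |t| larger than that would meet the significance criterion stated above.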

Results are shown in Figures 17 and 18. In Figure
17, the relative inhibition of neointima formation is depicted
graphically, demonstrating the ability of p56 and RB to
inhibit neointima formation. Figure 18 provides photographic
evidence of the dramatic reduction of neointima in the
presence of p56.

Adenovirus-treated carotid arteries were harvested
from rats at 2 days following balloon injury and infection.
Tissue was fixed in phosphate-buffered formalin until embedded
in paraffin. Tissue was cut into 4 µm cross-sections and
dewaxed through xylene and graded alcohols. Endogenous
peroxidase was quenched with 1% hydrogen peroxide for 30
minutes. Antigen retrieval was performed in 10 mM sodium
citrate buffer, pH 6.0 at 95°C for 10 minutes. A monoclonal
anti-RB antibody (AB-5, Oncogene Sciences, Uniondale, New
York) was applied at 10 µg/ml in PBS in a humid chamber at 4°C
for 24 hours. Secondary antibody was applied from the Unitect
Mouse Immunohistochemistry Kit (Oncogene Sciences, Uniondale,
New York) according to the manufacturer's instructions. The
antibody complexes were visualized using 3,3'-diaminobenzidine
(DAB, Vector Laboratories, Burlingame, CA). Slides were then
counterstained with hematoxylin and mounted. The results are
depicted in Figure 20.

All references cited herein are hereby incorporated 
by reference in their entirety for all purposes. 
1. A polypeptide comprising a fusion of a 
transcription factor, the transcription factor comprising a 
DNA binding domain, and a retinoblastoma (RB) polypeptide, the 
RB polypeptide comprising a growth suppression domain. 

2. A nucleic acid encoding the fusion polypeptide 
of claim 1. 

3. The nucleic acid of claim 2, wherein the 
nucleic acid is inserted in an adenovirus vector. 

4. The polypeptide of claim 1, wherein the 
transcription factor is E2F. 

5. The polypeptide of claim 4, wherein the cyclin 
A binding domain of the E2F is deleted or nonfunctional. 

6. The polypeptide of claim 1, wherein the 
retinoblastoma polypeptide is RB56. 

7. The polypeptide of claim 1, wherein the 
retinoblastoma polypeptide is wild type RB. 

8. The polypeptide of claim 1, wherein the 
retinoblastoma polypeptide comprises from about amino acid 
residue 379 to about amino acid residue 928 of pRB. 

9. The polypeptide of claim 1, wherein the 
retinoblastoma polypeptide comprises at least one substitution 
of amino acid residues selected from the group consisting of 
2, 608, 612, 788, 807, and 811 of pRB. 

10. The polypeptide of claim 5, wherein the E2F 
comprises about amino acid residues 95 to about 286. 
11. The polypeptide of claim 4, wherein the E2F 
comprises about amino acid residues 95 to about 194. 

12. The polypeptide of claim 1, wherein the fusion 
comprises E2F amino acid residues from about 95 to about 194 
operatively linked to RB amino acid residues from about 379 to 
about 928. 

13. An expression vector comprising DNA encoding a 
polypeptide, the polypeptide comprising a fusion of a 
transcription factor, the transcription factor comprising a 
DNA binding domain, and a retinoblastoma (RB) polypeptide, the 
RB polypeptide comprising a growth suppression domain. 

14. The vector of claim 13, comprising a 
tissue-specific promoter operatively linked to DNA encoding the 
fusion. 

15. The vector of claim 14, wherein the tissue 
specific promoter is a smooth muscle actin promoter. 

16. A method for treatment of a hyperproliferative 
disorder in a patient comprising administering to a patient a 
therapeutically effective dose of a fusion polypeptide 
comprising a fusion of a transcription factor, the 
transcription factor comprising a DNA binding domain, and a 
retinoblastoma (RB) polypeptide, the RB polypeptide comprising 
a growth suppression domain. 

17. The method of claim 16, wherein the fusion 
protein is encoded by a nucleic acid delivered to the patient. 

18. The method of claim 16, wherein the 
transcription factor is E2F. 

19. The method of claim 18, wherein the cyclin A 
binding domain of the E2F is deleted or nonfunctional. 
20. The method of claim 16, wherein the RB is RB56. 

21. The method of claim 16, wherein the RB is wild 
type RB56. 

22. The method of claim 16, wherein the RB 
comprises from about amino acid residue 379 to about amino 
acid residue 928. 

23. The method of claim 16, wherein the RB 
comprises at least one substitution of amino acid residues 
selected from the group consisting of 2, 608, 612, 788, 807, 
and 811. 

24. The method of claim 18, wherein the E2F 
comprises about amino acid residues 95 to about 286. 

25. The method of claim 18, wherein the E2F 
comprises about amino acid residues 95 to about 194. 

26. The method of claim 16, wherein the fusion 
comprises E2F amino acid residues from about 95 to about 194 
operatively linked to RB amino acid residues from about 379 to 
about 928. 

27. The method of claim 18, wherein the E2F-RB 
fusion polypeptide is expressed under the control of a 
tissue-specific promoter. 

28. The method of claim 27, wherein the tissue 
specific promoter is a smooth muscle actin promoter. 

29. The method of claim 16, wherein the 
hyperproliferative disorder is cancer. 

30. The method of claim 29, wherein the cancer is 
bladder cancer. 



WO 98/21228 PCT/US97/21821 

31. The method of claim 29, wherein the 
hyperproliferative disorder is restenosis. 

32. The method of claim 31, wherein the E2F-RB 
fusion polypeptide is administered after angioplasty. 

33. The method of claim 32, wherein the E2F-RB 
fusion polypeptide is administered as a coating on an 
angioplasty device. 

34. The method of claim 17, wherein the nucleic 
acid is administered after angioplasty. 

35. The method of claim 17, wherein the nucleic 
acid is administered as a coating on an angioplasty device. 

36. The method of claim 17, wherein the nucleic 
acid is inserted in an adenovirus vector. 
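
The residue ranges recited in the claims (for example, E2F residues from about 95 to about 194 linked to RB residues from about 379 to about 928) are 1-based and inclusive. The sketch below shows how such ranges map onto sequence slices. It is illustrative only: the helper names are hypothetical, the sequences are placeholders of the appropriate lengths rather than real E2F or RB sequences, and placing the E2F segment at the N-terminus is an assumption made for the example.

```python
def residues(seq, start, end):
    """Return residues start..end (1-based, inclusive), as positions are cited in the claims."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("residue range lies outside the sequence")
    return seq[start - 1:end]


def make_fusion(e2f_seq, rb_seq, e2f_range=(95, 194), rb_range=(379, 928)):
    """Join an E2F segment to an RB segment (E2F first; an assumed orientation)."""
    return residues(e2f_seq, *e2f_range) + residues(rb_seq, *rb_range)


# Placeholder sequences of the right lengths only; NOT real E2F or RB sequences.
e2f_placeholder = "E" * 437
rb_placeholder = "R" * 928

fusion = make_fusion(e2f_placeholder, rb_placeholder)
print(len(fusion))  # 100 E2F residues + 550 RB residues
```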
10 20 30 40 50 60 

MALAGAPAGG PCAPALEALL GAGALRLLDS SQIVIISAAQ DASAPPAPTG PAAPAAGPCD 

70 80 90 100 110 120 

PDLLLFATPQ APRPTPSAPR PALGRPPVKR RLDLETDHQY LAESSGPARG RGRHPGKGVK 



130 140 150 160 170 180 

SPGEKSRYET SLNLTTKRFL ELLSHSADGV VDLNWAAEVL KVQKRRIYDI TNVLEGIQLI 



190 200 210 220 230 240 

AKKSKNHIQW LGSHTTVGVG GRLEGLTQDL RQLQESEQQL DHLMNICTTQ LRLLSEDTDS 

250 260 270 280 290 300 

QRLAYVTCQD LRSIADPAEQ MVMVIKAPPE TQLQAVDSSE NFQISLKSKQ GPIDVFLCPE 

310 320 330 340 350 360 

ETVGGISPGK TPSQEVTSEE ENRATDSATI VSPPPSSPPS SLTTDPSQSL LSLEQEPLLS 

370 380 390 400 410 420 

RMGSLRAPVD EDRLSPLVAA DSLLEHVRED FSGLLPEEFI SLSPPHEALD YHFGLEEGEG 



430 440 
IRDLFDCDFG DLTPLDF* 
10 20 30 40 50 60 

GGAATTCCGT GGCCGGGACT TTGCAGGCAG CGGCGGCCGG GGGCGGAGCG GGATCGAGCC 

70 80 90 100 110 120 

CTCGCCGAGG CCTGCCGCCA TGGGCCCGCG CCGCCGCCGC CGCCTGTCAC CCGGGCCGCG 

130 140 150 160 170 180 

CGGGCCGTGA GCGTCATGGC CTTGGCCGGG GCCCCTGCGG GCGGCCCATG CGCGCCGGCG 

190 200 210 220 230 240 

CTGGAGGCCC TGCTCGGGGC CGGCGCGCTG CGGCTGCTCG ACTCCTCGCA GATCGTCATC 

250 260 270 280 290 300 

ATCTCCGCCG CGCAGGACGC CAGCGCCCCG CCGGCTCCCA CCGGCCCCGC GGCGCCCGCC 

310 320 330 340 350 360 

GCCGGCCCCT GCGACCCTGA CCTGCTGCTC TTCGCCACAC CGCAGGCGCC CCGGCCCACA 

370 380 390 400 410 420 

CCCAGTGCGC CGCGGCCCGC GCTCGGCCGC CCGCCGGTGA AGCGGAGGCT GGACCTGGAA 

430 440 450 460 470 480 

ACTGACCATC AGTACCTGGC CGAGAGCAGT GGGCCAGCTC GGGGCAGAGG CCGCCATCCA 

490 500 510 520 530 540 

GGAAAAGGTG TGAAATCCCC GGGGGAGAAG TCACGCTATG AGACCTCACT GAATCTGACC 

550 560 570 580 590 600 

ACCAAGCGCT TCCTGGAGCT GCTGAGCCAC TCGGCTGACG GTGTCGTCGA CCTGAACTGG 

610 620 630 640 650 660 

GCTGCCGAGG TGCTGAAGGT GCAGAAGCGG CGCATCTATG ACATCACCAA CGTCCTTGAG 

670 680 690 700 710 720 

GGCATCCAGC TCATTGCCAA GAAGTCCAAG AACCACATCC AGTGGCTGGG CAGCCACACC 

730 740 750 760 770 780 

ACAGTGGGCG TCGGCGGACG GCTTGAGGGG TTGACCCAGG ACCTCCGACA GCTGCAGGAG 

790 800 810 820 830 840 

AGCGAGCAGC AGCTGGACCA CCTGATGAAT ATCTGTACTA CGCAGCTGCG CCTGCTCTCC 

850 860 870 880 890 900 

GAGGACACTG ACAGCCAGCG CCTGGCCTAC GTGACGTGTC AGGACCTTCG TAGCATTGCA 

910 920 930 940 950 960 

GACCCTGCAG AGCAGATGGT TATGGTGATC AAAGCCCCTC CTGAGACCCA GCTCCAAGCC 

970 980 990 1000 1010 1020 

GTGGACTCTT CGGAGAACTT TCAGATCTCC CTTAAGAGCA AACAAGGCCC GATCGATGTT 

1030 1040 1050 1060 1070 1080 

TTCCTGTGCC CTGAGGAGAC CGTAGGTGGG ATCAGCCCTG GGAAGACCCC ATCCCAGGAG 

1090 1100 1110 1120 1130 1140 

GTCACTTCTG AGGAGGAGAA CAGGGCCACT GACTCTGCCA CCATAGTGTC ACCACCACCA 

1150 1160 1170 1180 1190 1200 

TCATCTCCCC CCTCATCCCT CACCACAGAT CCCAGCCAGT CTCTACTCAG CCTGGAGCAA 

1210 1220 1230 1240 1250 1260 

GAACCGCTGT TGTCCCGGAT GGGCAGCCTG CGGGCTCCCG TGGACGAGGA CCGCCTGTCC 



FIG. 1B 
1270 1280 1290 1300 1310 1320 

CCGCTGGTGG CGGCCGACTC GCTCCTGGAG CATGTGCGGG AGGACTTCTC CGGCCTCCTC 

1330 1340 1350 1360 1370 1380 

CCTGAGGAGT TCATCAGCCT TTCCCCACCC CACGAGGCCC TCGACTACCA CTTCGGCCTC 

1390 1400 1410 1420 1430 1440 

GAGGAGGGCG AGGGCATCAG AGACCTCTTC GACTGTGACT TTGGGGACCT CACCCCCCTG 

1450 1460 1470 1480 1490 1500 

GATTTCTGAC AGGGCTTGGA GGGACCAGGG TTTCCAGAGT AGCTCACCTT GTCTCTGCAG 

1510 1520 1530 1540 1550 1560 

CCCTGGAGCC CCCTGTCCCT GGCCGTCCTC CCAGCCTGTT TGGAAACATT TAATTTATAC 

1570 1580 1590 1600 1610 1620 

CCCTCTCCTC TGTCTCCAGA AGCTTCTAGC TCTGGGGTCT GGCTACCGCT AGGAGGCTGA 

1630 1640 1650 1660 1670 1680 

GCAAGCCAGG AAGGGAAGGA GTCTGTGTGG TGTGTATGTG CATGCAGCCT ACACCCACAC 

1690 1700 1710 1720 1730 1740 

GTGTGTACCG GGGGTGAATG TGTGTGAGCA TGTGTGTGTG CATGTACCGG GGAATGAAGG 

1750 1760 1770 1780 1790 1800 

TGAACATACA CCTCTGTGTG TGCACTGCAG ACACGCCCCA GTGTGTCCAC ATGTGTGTGC 

1810 1820 1830 1840 1850 1860 

ATGAGTCCAT CTCTGCGCGT GGGGGGGCTC TAACTGCACT TTCGGCCCTT TTGCTCGTGG 

1870 1880 1890 1900 1910 1920 

GGTCCCACAA GGCCCAGGGC AGTGCCTGCT CCCAGAATCT GGTGCTCTGA CCAGGCCAGG 

1930 1940 1950 1960 1970 1980 

TGGGGAGGCT TTGGCTGGCT GGGCGTGTAG GACGGTGAGA GCACTTCTGT CTTAAAGGTT 

1990 2000 2010 2020 2030 2040 

TTTTCTGATT GAAGCTTTAA TGGAGCGTTA TTTATTTATC GAGGCCTCTT TGGTGAGCCT 

2050 2060 2070 2080 2090 2100 

GGGGAATCAG CAAAAGGGGA GGAGGGGTGT GGGGTTGATA CCCCAACTCC CTCTACCCTT 

2110 2120 2130 2140 2150 2160 

GAGCAAGGGC AGGGGTCCCT GAGCTGTTCT TCTGCCCCAT ACTGAAGGAA CTGAGGCCTG 

2170 2180 2190 2200 2210 2220 

GGTGATTTAT TTATTGGGAA AGTGAGGGAG GGAGACAGAC TGACTGACAG CCATGGGTGG 

2230 2240 2250 2260 2270 2280 

TCAGATGGTG GGGTGGGCCC TCTCCAGGGG GCCAGTTCAG GGCCCAGCTG CCCCCCAGGA 

2290 2300 2310 2320 2330 2340 

TGGATATGAG ATGGGAGAGG TGAGTGGGGG ACCTTCACTG ATGTGGGCAG GAGGGGTGGT 

2350 2360 2370 2380 2390 2400 

GAAGGCCTCC CCCAGCCCAG ACCCTGTGGT CCCTCCTGCA GTGTCTGAAG CGCCTGCCTC 

2410 2420 2430 2440 2450 2460 

CCCACTGCTC TGCCCCACCC TCCAATCTGC ACTTTGATTT GCTTCCTAAC AGCTCTGTTC 

2470 2480 2490 2500 2510 2520 

CCTCCTGCTT TGGTTTTAAT AAATATTTTG ATGACGTTAA AAAAAGGAAT TCGATAT 
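
The relationship between the nucleotide sequence above and the E2F amino acid sequence can be checked with a short translation sketch. It uses the standard genetic code and assumes the open reading frame begins at the ATG at nucleotide 136, which is where the first codons match the start of the protein sequence (MALAGAPAGG). The function name and the choice of fragment are illustrative.

```python
BASES = "TCAG"
# Standard genetic code, ordered by first, second, third base in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate DNA in frame from its first base, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Nucleotides 136-165 of the sequence above (the first ten codons of the ORF).
orf_start = "ATGGCCTTGGCCGGGGCCCCTGCGGGCGGC"
print(translate(orf_start))
```

The same function applied to nucleotides 1441-1450 (GATTTCTGAC) yields the final two residues, D and F, followed by the TGA stop codon, consistent with the C-terminus of the protein sequence.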
1 ttccggtttt tctcagggga cgttgaaatt atttttgtaa cgggagtcgg gagaggacgg 
61 ggcgtgcccc gcgtgcgcgc gcgtcgccct ccccggcgct cctccacagc tcgctggctc 
121 ccgccgcgga aaggcgtcac gccgcccaaa accccccgaa aaacggccgc caccgccgcc 
181 gctgccgccg cggaaccccc ggcaccgccg ccgccgcccc ctcctgagga ggacccagag 
241 caggacagcg gcccggagga cccgcctctc gtcaggcttg agtttgaaga aacagaagaa 
301 cctgacttta ctgcattatg tcagaaatta aagataccag atcatgtcag agagagagct 
361 tggttaactt gggagaaagc ttcatctgtg gatggagtat tgggaggtta tattcaaaag 
421 aaaaaggaac tgtggggaat ctgtatcttt attgcagcag ttgacctaga tgagatgtcg 
481 ttcactttta ctgagctaca gaaaaacata gaaatcagtg tccataaatt ctttaactta 
541 ctaaaagaaa ttgataccag taccaaagtt gataatgcta tgtcaagact gttgaagaag 
601 tatgatgtat tgtttgcact cttcagcaaa ttggaaagga catgtgaact tatatatttg 
661 acacaaccca gcagttcgat atctactgaa ataaattctg cattggtgct aaaagtttct 
721 tggatcacat ttttattagc taaaggggaa gtattacaaa tggaagatga tctggtgatt 
781 tcatttcagt taatgctatg tgtccttgac tattttatta aactctcacc tcccatgttg 
841 cccaaagaac catacaaaac agcngttata cccattaatg gttcacctcg aacacccagg 
901 cgaggtcaga acaggagtgc acggatagca aaacaactag aaaacgatac aagaattatt 
961 gaagttctct gtaaagaaca tgaatgtaat atagatgagg tgaaaaatgt ctacttcaaa 
1021 aatcttacac cttttatgaa ttctcttgga cttgtaacat ctaatggact tccagaggtt 
1081 gaaaatcttt ctaaacgata cgaagaaatt tatcttaaaa ataaagatct agacgcaaga 
1141 ttatttttgg accacgataa aactcttcag accgattcta cagacagctt tgaaacacag 
1201 agaacaccac gaaaaagtaa ccttgatgaa gaggtgaatg taattcctcc acacactcca 
1261 gttaggactg ttatgaacac tatccaacaa ttaatgatga ttttaaattc agcaagtgat 
1321 caaccttcag aaaatctgat ttcctatttt aacaactgca cagtgaatcc aaaagaaagt 
1381 atactgaaaa gagtgaagga tataggatac atctttaaag agaaatttgc taaagctgtg 
1441 ggacagggtt gtgtcgaaat tggatcacag cgatacaaac ttggagttcg cttgtattac 
1501 cgagtaatgg aatccatgct taaatcagaa gaagaacgat tatccattca aaattttagc 
1561 aaacttctga atgacaacat ttttcatatg tctttattgg cgtgcgctct tgaggttgta 
1621 atggccacat atagcagaag tacatctcag aatcttgatt ctggaacaga tttgtctttc 
1681 ccatggattc tgaatgtgct taatttaaaa gcctttgatt tttacaaagt gatcgaaagt 
1741 tttatcaaag cagaaggcaa cttgacaaga gaaatgataa aacatttaga acgatgtgaa 
1801 catcgaatca tggaatccct tgcatggctc tcagattcac ctttatttga tcttattaaa 
1861 caatcaaagg accgagaagg accaactgac caccttgaat ctgcttgtcc tcttaatctt 
1921 ccnctccaga ataatcacac tgcagcagat atgtatcttt ctcctgtaag atctccaaag 
1981 aaaaaaggtt caactacgcg tgtaaattct actgcaaatg cagagacaca agcaacctca 
2041 gccttccaga cccagaagcc actgaaatct acctctcttt cactgtttta taaaaaagtg 
2101 tatcggctag cctatctccg gctaaataca ctttgtgaac gccttctgtc tgagcaccca 
2161 gaattagaac atatcatctg gacccttttc cagcacaccc tgcagaatga gtatgaactc 
2221 atgagagaca ggcatttgga ccaaattatg atgtgttcca tgtatggcat atgcaaagcg 
2281 aagaatatag accttaaatt caaaatcatt gtaacagcat acaaggatct tcctcatgct 
2341 gttcaggaga cattcaaacg tgttttgatc aaagaagagg agtatgattc tattatagta 
2401 ttctataact cggtcttcat gcagagactg aaaacaaata ttttgcagta tgcttccacc 
2461 aggcccccta ccttgtcacc aatacctcac attcctcgaa gcccttacaa gtttcctagt 
2521 tcacccttac ggattcctgg agggaacatc tatatttcac ccctgaagag tccatataaa 
2581 atttcagaag gtctgccaac accaacaaaa atgactccaa gatcaagaat cttagtatca 
2641 attggtgaat cattcgggac ttctgagaag ttccagaaaa taaatcagat ggtatgtaac 
2701 agcgaccgtg tgctcaaaag aagtgctgaa ggaagcaacc ctcctaaacc actgaaaaaa 
2761 ctacgctttg atattgaagg atcagatgaa gcagatggaa gtaaacatct cccaggagag 
2821 tccaaatttc agcagaaact ggcagaaatg acttctactc gaacacgaat gcaaaagcag 
2881 aaaatgaatg atagcatgga tacctcaaac aaggaagaga aatgaggatc tcaggacctt 
2941 ggcggacact gtgtacacct ctggattcat tgtctctcac agatgtgact gtat 
LFSKLERTCELIYLTQPSSSISTEINSALVLKVSWITFLLAKGEVLQMEDDLVISFQL 
MLCVLDYFIKLSPPMLLKEPYKTAVIPINGSPRTPRRGQNRSARIAKQLENDTRIIEV 
LCKEHECNIDEVKNVYFKNFIPFMNSLGLVTSNGLPEVENLSKRYEEIYLKNKDLDAR 
LFLDHDKTLQTDSIDSFETQRTPRKSNLDEEVNVIPPHTPVRTVMNTIQQLMMILNSA 
>HincII 
>AccI 
>BglII >SalI 

10 20 30 40 50 60 

GACGGATCGG GAGATCTCCC GATCCCCTAT GGTCGACTCT CAGTACAATC TGCTCTGATG 

>AlwNI 

70 80 90 100 110 120 

CCGCATAGTT AAGCCAGTAT CTGCTCCCTG CTTGTGTGTT GGAGGTCGCT GAGTAGTGCG 

>ApoI >MfeI 
I I 

130 140 150 160 170 180 

* I * * * * # * j * * * * 
CGAGCAAAAT TTAAGCTACA ACAAGGCAAG GCTTGACCGA CAATTGCATG AAGAATCTGC 

>HincII 

>AflIII 

>NruI >MluI 
190 200 210 220 230 

* * * * *j* * * * ] * . 
TTAGGGTTAG GCGTTTTGCG CTGCTTCG CGA TGT ACG GGC CAG ATA TAC GCG TTG 

Arg Cys Thr Gly Gln Ile Tyr Ala Leu> 
d d CMV PROMOTER d d > 

>SpeI >AseI 

240 250 I 260 270 280 

* * I* *|** * * * 

ACA TTG ATT ATT GAC TAG TTA TTA ATA GTA ATC AAT TAC GGG GTC ATT 
Thr Leu Ile Ile Asp *** Leu Leu Ile Val Ile Asn Tyr Gly Val Ile> 

d d d d d d CMV PROMOTER d d d d d d > 



290 300 310 320 330 

AGT TCA TAG CCC ATA TAT GGA GTT CCG CGT TAC ATA ACT TAC GGT AAA 
Ser Ser *** Pro Ile Tyr Gly Val Pro Arg Tyr Ile Thr Tyr Gly Lys> 
d d d d d d_CMV PROMOTER d d d d d d > 

>BglI >AatII 

1 1 
340 350 360 370 I 

* * * * * * * * + 

TGG CCC GCC TGG CTG ACC GCC CAA CGA CCC CCG CCC ATT GAC GTC AAT 
Trp Pro Ala Trp Leu Thr Ala Gln Arg Pro Pro Pro Ile Asp Val Asn> 
d d d d d d_CMV PROMOTER d d d d d d > 



380 390 400 410 420 



AAT GAC GTA TGT TCC CAT AGT AAC GCC AAT AGG GAC TTT CCA TTG ACG 
Asn Asp Val Cys Ser His Ser Asn Ala Asn Arg Asp Phe Pro Leu Thr> 
d d d d d d_CMV PROMOTER d d d d d d > 
>AatII >BglI 

430 440 450 460 470 

TCA ATG GGT GGA CTA TTT ACG GTA AAC TGC CCA CTT GGC AGT ACA TCA 
Ser Met Gly Gly Leu Phe Thr Val Asn Cys Pro Leu Gly Ser Thr Ser> 
d d d d d d_CMV PROMOTER d d d d d d > 

>NdeI >AatII 
480 490 500 510 520 
AGT GTA TCA TAT GCC AAG TAC GCC CCC TAT TGA CGT CAA TGA CGG TAA 
Ser Val Ser Tyr Ala Lys Tyr Ala Pro Tyr *** Arg Gln *** Arg ***> 
d d d d d d_CMV PROMOTER d d d d d d > 

>BglI 

530 | 540 550 560 570 
, * * * * * 
ATG GCC CGC CTG GCA TTA TGC CCA GTA CAT GAC CTT ATG GGA CTT TCC 
Met Ala Arg Leu Ala Leu Cys Pro Val His Asp Leu Met Gly Leu Ser> 
d d d d d d_CMV PROMOTER d d d d d d > 

>BsaAI >NcoI 

I I 
>SnaBI >StyI >MslI 

I I I 

580 590 600 610 1 

* * 

TAC TTG GCA GTA CAT CTA CGT ATT AGT CAT CGC TAT TAC CAT GGT GAT 
Tyr Leu Ala Val His Leu Arg Ile Ser His Arg Tyr Tyr His Gly Asp> 
d d d d d d_CMV PROMOTER d d d d d d > 



620 630 640 650 660 

GCG GTT TTG GCA GTA CAT CAA TGG GCG TGG ATA GCG GTT TGA CTC ACG 
Ala Val Leu Ala Val His Gln Trp Ala Trp Ile Ala Val *** Leu Thr> 
d d d d d d_CMV PROMOTER d d d d d d > 

>AatII >BanI 
I I 
670 680 690 I 700 710 I 
* * * * * I + * * * 
GGG ATT TCC AAG TCT CCA CCC CAT TGA CGT CAA TGG GAG TTT GTT TTG 
Gly Ile Ser Lys Ser Pro Pro His *** Arg Gln Trp Glu Phe Val Leu> 
d d d d d d_CMV PROMOTER d d d d d d > 

720 730 740 750 760 

********* 
GCA CCA AAA TCA ACG GGA CTT TCC AAA ATG TCG TAA CAA CTC CGC CCC 
Ala Pro Lys Ser Thr Gly Leu Ser Lys Met Ser *** Gln Leu Arg Pro> 
d d d d d d_CMV PROMOTER d d d d d d > 

770 780 790 800 810 

* 

ATT GAC GCA AAT GGG CGG TAG GCG TGT ACG GTG GGA GGT CTA TAT AAG 
Ile Asp Ala Asn Gly Arg *** Ala Cys Thr Val Gly Gly Leu Tyr Lys> 
d d d d d d_CMV PROMOTER d d d d d d > 
>BanII 
>SacI 
>BsiHKAI 
>Ecl136II 

820 830 840 850 

CAG AGC TCT CTG GCT AAC TAG AGA ACC CAC TGC TTA CTG GCT TAT CGA 
Gln Ser Ser Leu Ala Asn *** Arg Thr His Cys Leu Leu Ala Tyr Arg> 
d d d d d d CMV PROMOTER d d d d d d > 



>AseI 
>T7 PROMOTER 
>BsaI 
>SfcI 
>HindIII 
>KpnI 
>Acc65I 
>BanI 

860 870 880 890 900 910 



AAT T AATACGA CTCACTATAG GGAGACCCAA GCTTCGCGCG GGTACCACTC 
Asn Xxx> 
d_> 

>PflMI 
>PvuII 
>MspA1I 
>EarI 
>BanII 

920 930 940 950 960 970 

TCTTCCGCAT CGCTGTCTGC GAGGGCCAGC TGTTGGGCTC GCGGTTGAGG ACAAACTCTT 
e TRIPARTITE LEADER SEQUENCE e > 

>EarI 
>ScaI 

980 990 1000 1010 1020 1030 



CGCGGTCTTT CCAGTACTCT TGGATCGGAA ACCCGTCGGC CTCCGAACGG TACTCCGCCA 
e TRIPARTITE LEADER SEQUENCE _e > 

>SfcI 
>MspA1I 
>BsiEI 
>PpuMI 
>EcoO109I 
>BsiEI 
>BsaWI 
>XhoI 
>PaeR7I >EaeI 
>BsoBI >NotI 
>AvaI 
>EagI 

1040 1050 1060 1070 1080 1090 



CCGAGGGACC TGAGCGAGTC CGCATCGACC GGATCGGAAA ACCTCTCGAG GCGGCCGCTG 
TRIPARTITE LEADER SEQUENCE e > 
>ApaI 
>ClaI >EcoO109I 
>EcoRV >Bsp120I >SfcI 
>XbaI >ApoI 
>PstI >EcoRI >BsiWI >BspDI >BanII >MslI 

1100 1110 1120 1130 1140 

CAGTCTAGAC GAATTCGCGT ACGATATCGA TGGGCCCTAT T CTA TAG TGT CAC CTA 

Leu *** Cys His Leu> 
SP6 PROMOTER > 



>BanII 
>BsiHKAI 
>SacI 
>Ecl136II >BclI 
>BGH_POLY_A 

1150 1160 1170 1180 1190 1200 

AAT G CTAGAGCTCG CTGATCAGCC TCGACTGTGC CTTCTAGTTG CCAGCCATCT 

Asn> 

> 

>BanI 

1210 1220 1230 1240 1250 1260 

* * * * + * * * * * * * 

GTTGTTTGCC CCTCCCCCGT GCCTTCCTTG ACCCTGGAAG GTGCCACTCC CACTGTCCTT 

1270 1280 1290 1300 1310 1320 

TCCTAATAAA ATGAGGAAAT TGCATCGCAT TGTCTGAGTA GGTGTCATTC TATTCTGGGG 

>BbsI 
I 

1330 1340 1350 1360 I 1370 1380 

r * 

GGTGGGGTGG GGCAGGACAG CAAGGGGGAG GATTGGGAAG ACAATAGCCG AAATGACCGA 

>BssSI 
I 

>BspMI 

1390 1400 ! 1410 1420 1430 1440 

|* * * * 

CCAAGCGACG CCCAACCTGC CATCACGAGA TTTCGATTCC ACCGCCGCCT TCTATGAAAG 

>NaeI 
>BsrFI 
>BpmI 
>NgoMI 

1450 1460 1470 1480 1490 1500 



GTTGGGCTTC GGAATCGTTT TCCGGGACGC CGGCTGGATG ATCCTCCAGC GCGGGGATCT 
>BpmI 

>SV40_early_poly_A 

1510 1520 1530 1540 1550 1560 

CATGCTGGAG TTCTTCGCCC ACCCCAACTT GTTTATTGCA GCTTATAATG GTTACAAATA 

>ApoI >BsmI 
| I 
1570 1580 1590 1600 1610 1620 

* I + * * * I * 

AAGCAATAGC ATCACAAATT TCACAAATAA AGCATTTTTT TCACTGCATT CTAGTTGTGG 

>HincII 
>Bst1107I >AccI 
>AccI >SalI 

1630 1640 1650 1660 1670 1680 

TTTGTCCAAA CTCATCAATG TATCTTATCA TGTCTGTATA CCGTCGACCT CTAGCTAGAG 

>BsrBI 
I 

1690 1700 1710 1720 1730 1740 

I* 

CTTGGCGTAA TCATGGTCAT AGCTGTTTCC TGTGTGAAAT TGTTATCCGC TCACAATTCC 
c PUC19 BACKBONE H3 TO AATII c > 

>BanI 
I 

1750 1760 1770 1780 1 1790 1800 

* * + * **|* + * * 

ACACAACATA CGAGCCGGAA GCATAAAGTG TAAAGCCTGG GGTGCCTAAT GAGTGAGCTA 
c PUC19 BACKBONE H3 TO AATII c_ > 

>AseI 

1810 1820 1830 1840 1850 1860 
* I * * * * * ■** * + 
ACTCACATTA ATTGCGTTGC GCTCACTGCC CGCTTTCCAG TCGGGAAACC TGTCGTGCCA 
c PUC19 BACKBONE H3 TO AATII_ c > 

>PvuII 
>MspA1I >AseI >EaeI >HaeII 

1870 1880 1890 1900 1910 1920 

GCTGCATTAA TGAATCGGCC AACGCGCGGG GAGAGGCGGT TTGCGTATTG GGCGCTCTTC 
c PUC19 BACKBONE H3 TO AATII c > 



>EarI 
>SapI >BsiEI >BsrBI 

1930 1940 1950 1960 1970 1980 
CGCTTCCTCG CTCACTGACT CGCTGCGCTC GGTCGTTCGG CTGCGGCGAG CGGTATCAGC 
c PUC19 BACKBONE H3 TO AATII c > 
>AflIII 
I 

1990 2000 2010 2020 2030 2040 

* I * 

TCACTCAAAG GCGGTAATAC GGTTATCCAC AGAATCAGGG GATAACGCAG GAAAGAACAT 
c PUC19 BACKBONE H3 TO AATII c > 

2050 2060 2070 2080 2090 2100 

GTGAGCAAAA GGCCAGCAAA AGGCCAGGAA CCGTAAAAAG GCCGCGTTGC TGGCGTTTTT 
c PUC19 BACKBONE H3 TO AATII c > 

>DrdI 

I 

2110 2120 2130 2140 i 2150 2160 

★ * * * * * + * * 

CCATAGGCTC CGCCCCCCTG ACGAGCATCA CAAAAATCGA CGCTCAAGTC AGAGGTGGCG 
c PUC19 BACKBONE H3 TO AATII c > 

>BssSI 
i 

2170 2180 2190 2200 2210 ^ 2220 

AAACCCGACA GGACTATAAA GATACCAGGC GTTTCCCCCT GGAAGCTCCC TCGTGCGCTC 
c PUC19 BACKBONE H3 TO AATII c > 

>BsaWI 

2230 2240 I 2250 2260 2270 2280 

* * # + | * * + * * * * * 

TCCTGTTCCG ACCCTGCCGC TTACCGGATA CCTGTCCGCC TTTCTCCCTT CGGGAAGCGT 
c PUC19 BACKBONE H3 TO AATII c > 

>HaeII >SfcI 

j 2290 2300 I 2310 2320 2330 2340 

* * * * I * * # * * + * * 

GGCGCTTTCT CAATGCTCAC GCTGTAGGTA TCTCAGTTCG GTGTAGGTCG TTCGCTCCAA 
c PUC19 BACKBONE H3 TO AATII c > 

>BsiHKAI >MspA1I 
I I 
>ApaLI| >BsiEI >BsaWI 
II It I 
2350 I i 2360 2370 2380 2390 2400 
**|** * * * I I* * * 
GCTGGGCTGT GTGCACGAAC CCCCCGTTCA GCCCGACCGC TGCGCCTTAT CCGGTAACTA 
c PUC19 BACKBONE H3 TO AATII c > 

>AlwNI 
I 

2410 2420 2430 2440 2450 1 2460 

TCGTCTTGAG TCCAACCCGG TAAGACACGA CTTATCGCCA CTGGCAGCAG CCACTGGTAA 
c PUC19 BACKBONE H3 TO AATII c > 

>SfcI 

2470 2480 2490 I 2500 2510 2520 

CAGGATTAGC AGAGCGAGGT ATGTAGGCGG TGCTACAGAG TTCTTGAAGT GGTGGCCTAA 
c PUC19 BACKBONE H3 TO AATII c _> 
2530 2540 2550 2560 2570 2580 

CTACGGCTAC ACTAGAAGGA CAGTATTTGG TATCTGCGCT CTGCTGAAGC CAGTTACCTT 
c PUC19 BACKBONE H3 TO AATII c > 

>Eco57I >MspA1I 

2590 2600 2610 2620 2630 2640 

CGGAAAAAGA GTTGGTAGCT CTTGATCCGG CAAACAAACC ACCGCTGGTA GCGGTGGTTT 
c PUC19 BACKBONE H3 TO AATII c > 



2650 2660 2670 2680 2690 2700 

* 

TTTTGTTTGC AAGCAGCAGA TTACGCGCAG AAAAAAAGGA TCTCAAGAAG ATCCTTTGAT 
c PUC19 BACKBONE H3 TO AATII c > 

>BspHI 
I 

2710 2720 2730 2740 2750 2760 

I 

CTTTTCTACG GGGTCTGACG CTCAGTGGAA CGAAAACTCA CGTTAAGGGA TTTTGGTCAT 
c PUC19 BACKBONE H3 TO AATII c > 

>DraI >DraI 
[ I 
2770 2780 2790 J 2800 ^ 2810 I 2820 

GAGATTATCA AAAAGGATCT TCACCTAGAT CCTTTTAAAT TAAAAATGAA GTTTTAAATC 
c PUC19 BACKBONE H3 TO AATII c > 

>BanI 

I 

2830 2840 2850 2860 2870 2880 

* * * * * * * * + .* * I 
AATCTAAAGT ATATATGAGT AAACTTGGTC TGACAGTTAC CAATGCTTAA TCAGTGAGGC 

a AMP-ORF > 

c PUC19 BACKBONE H3 TO AATII c > 

>AhdI 
I 

2890 2900 2910 2920 2930 2940 

* 

ACCTATCTCA GCGATCTGTC TATTTCGTTC ATCCATAGTT GCCTGACTCC CCGTCGTGTA 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 

>BsaI 
t 

>BsrDI >BpmI 
I I 

2950 2960 2970 2980 2990 I 3000 

* * * * * * * * + * t * 
GATAACTACG ATACGGGAGG GCTTACCATC TGGCCCCAGT GCTGCAATGA TACCGCGAGA 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 

>BsrFI >BglI 

3010 3020 3030 3040 3050 3060 

* * * * * * * * + * * 

CCCACGCTCA CCGGCTCCAG ATTTATCAGC AATAAACCAG CCAGCCGGAA GGGCCGAGCG 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 
>AseI 
i 

3070 3080 3090 3100 I 3110 3120 

CAGAAGTGGT CCTGCAACTT TATCCGCCTC CATCCAGTCT ATTAATTGTT GCCGGGAAGC 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 

>Psp1406I 
>FspI 
>BsrDI >SfcI 

3130 3140 3150 3160 3170 3180 



TAGAGTAAGT AGTTCGCCAG TTAATAGTTT GCGCAACGTT GTTGCCATTG CTACAGGCAT 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 



>MslI >BsaWI 
t ! 

I 3190 3200 3210 3220 I 3230 3240 

CGTGGTGTCA CGCTCGTCGT TTGGTATGGC TTCATTCACC TCCGGTTCCC AACGATCAAG 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 

>PvuI 
I 

>BsiEI 
I 

3250 3260 3270 3280 3290 3300 

. * * + * ★# * * + * + * 

GCGAGTTACA TGATCCCCCA TGTTGTGCAA AAAAGCGGTT AGCTCCTTCG GTCCTCCGAT 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 



>EaeI >MslI 
I I 
3310 3320 3330 3340 I 3350 3360 

w + *)+ + + * * I * * * 

CGTTGTCAGA AGTAAGTTGG CCGCAGTGTT ATCACTCATG GTTATGGCAG CACTGCATAA 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 



>ScaI 
I 

3370 3380 3390 3400 3410 3420 

* * # * * * * * * * + * 

TTCTCTTACT GTCATGCCAT CCGTAAGATG CTTTTCTGTG ACTGGTGAGT ACTCAACCAA 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 

>BsiEI 

3430 3440 3450 3460 3470 3480 

* * + + *j* + * ** ** 

GTCATTCTGA GAATAGTGTA TGCGGCGACC GAGTTGCTCT TGCCCGGCGT CAATACGGGA 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 

FIG. 4 

(CONTINUED) 




>XmnI 
>DraI >BsiHKAI >Psp1406I 

3490 3500 3510 3520 3530 3540 

TAATACCGCG CCACATAGCA GAACTTTAAA AGTGCTCATC ATTGGAAAAC GTTCTTCGGG 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 

>Eco57I 
>ApaLI 
>MspA1I >BssSI 

3550 3560 3570 3580 3590 3600 

* * * * * 

GCGAAAACTC TCAAGGATCT TACCGCTGTT GAGATCCAGT TCGATGTAAC CCACTCGTGC 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 

>BsiHKAI 

| 3610 3620 3630 3640 3650 3660 

ACCCAACTGA TCTTCAGCAT CTTTTACTTT CACCAGCGTT TCTGGGTGAG CAAAAACAGG 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 

>MslI 
I 

3670 3680 3690 3700 3710 3720 

* * * * # * * + ★ * * * 

AAGGCAAAAT GCCGCAAAAA AGGGAATAAG GGCGACACGG AAATGTTGAA TACTCATACT 

a a AMP-ORF a a > 

c PUC19 BACKBONE H3 TO AATII c > 

>EarI >SspI >BspHI >BsrBI 

II I I 

3730 3740 3750 3760 3770 3780 

* * I** * # * * 

CTTCCTTTTT CAATATTATT GAAGCATTTA TCAGGGTTAT TGTCTCATGA GCGGATACAT 
c PUC19 BACKBONE H3 TO AATII c > 

3790 3800 3810 3820 3830 3840 

* * ** + * * * ** ** 

ATTTGAATGT ATTTAGAAAA ATAAACAAAT AGGGGTTCCG CGCACATTTC CCCGAAAAGT 
c PUC19 BACKBONE H3 TO AATII c > 



>HincII 
>AccI 
>AatII 
>SalI 

3850 

GCCACCTGAC GTC 
c > 
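
The restriction site labels in the map above can be cross-checked against the sequence with a simple pattern search. The sketch below looks for three recognition sequences (BglII, AGATCT; SalI, GTCGAC; HincII, GTYRAC) in the first 60 nucleotides of the plasmid. The function name is hypothetical, only the strand shown is scanned (sufficient here because these sites are palindromic), and reported positions are the 1-based start of each recognition sequence rather than the cut position.

```python
import re

# Recognition sequences written 5'->3'; Y = C or T, R = A or G.
SITES = {
    "BglII": "AGATCT",
    "SalI": "GTCGAC",
    "HincII": "GT[CT][AG]AC",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [1-based start positions]} for each recognition sequence."""
    seq = seq.upper().replace(" ", "")
    hits = {}
    for enzyme, pattern in sites.items():
        # A lookahead is used so that overlapping matches are not skipped.
        hits[enzyme] = [m.start() + 1 for m in re.finditer(f"(?={pattern})", seq)]
    return hits


# Nucleotides 1-60 as printed at the start of the plasmid map.
first_60 = "GACGGATCGG GAGATCTCCC GATCCCCTAT GGTCGACTCT CAGTACAATC TGCTCTGATG"
print(find_sites(first_60))
```

The BglII hit falls between ruler marks 10 and 20 and the SalI and HincII hits between 30 and 40, in line with where those labels appear at the start of the map.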
>HincII 
>AccI 
>BglII >SalI 

10 20 30 40 50 60 

GACGGATCGG GAGATCTCCC GATCCCCTAT GGTCGACTCT CAGTACAATC TGCTCTGATG 

>AlwNI 
t 

70 80 90 100 110 120 

# * * + * * * * * + * * 

CCGCATAGTT AAGCCAGTAT CTGCTCCCTG CTTGTGTGTT GGAGGTCGCT GAGTAGTGCG 

>ApoI >MfeI 

I I 

130 140 150 160 170 180 

*l* * * + * * *j* * * * 

CGAGCAAAAT TTAAGCTACA ACAAGGCAAG GCTTGACCGA CAATTGCATG AAGAATCTGC 

>HincII 

>AflIII 

>NruI >MluI 
190 200 210 220 230 

TTAGGGTTAG GCGTTTTGCG CTGCTTCG CGA TGT ACG GGC CAG ATA TAC GCG TTG 

Arg Cys Thr Gly Gln Ile Tyr Ala Leu> 
e e CMV PROMOTER e e > 

>SpeI >AseI 
I I 

240 250 I 260 270 280 

★ *l* *|* * * * * 

ACA TTG ATT ATT GAC TAG TTA TTA ATA GTA ATC AAT TAC GGG GTC ATT 
Thr Leu Ile Ile Asp *** Leu Leu Ile Val Ile Asn Tyr Gly Val Ile> 
e e e e e e_CMV PROMOTER e e e e e e > 

290 300 310 320 330 

* * + ★ * * * * * * 

AGT TCA TAG CCC ATA TAT GGA GTT CCG CGT TAC ATA ACT TAC GGT AAA 
Ser Ser *** Pro Ile Tyr Gly Val Pro Arg Tyr Ile Thr Tyr Gly Lys> 
e e e e e e_CMV PROMOTER e e e e e e > 

>BglI >AatII 

I I 
340 350 360 370 I 

+ * * * # * * * ★ 

TGG CCC GCC TGG CTG ACC GCC CAA CGA CCC CCG CCC ATT GAC GTC AAT 
Trp Pro Ala Trp Leu Thr Ala Gln Arg Pro Pro Pro Ile Asp Val Asn> 
e e e e e e_CMV PROMOTER e e e e e e > 

380 390 400 410 420 

* + * + * + * * * * 

AAT GAC GTA TGT TCC CAT AGT AAC GCC AAT AGG GAC TTT CCA TTG ACG 
Asn Asp Val Cys Ser His Ser Asn Ala Asn Arg Asp Phe Pro Leu Thr> 
e e e e e e CMV PROMOTER e e e e e e > 
>AatII >BglI 
I I 
430 440 450 460 I 470 

j+ * * * * * + * * » 

TCA ATG GGT GGA CTA TTT ACG GTA AAC TGC CCA CTT GGC AGT ACA TCA 
Ser Met Gly Gly Leu Phe Thr Val Asn Cys Pro Leu Gly Ser Thr Ser> 
e e e e e_CMV PROMOTER e e e e e e > 

>NdeI >AatII 

480 490 500 510 520 

AGT GTA TCA TAT GCC AAG TAC GCC CCC TAT TGA CGT CAA TGA CGG TAA 

Ser Val Ser Tyr Ala Lys Tyr Ala Pro Tyr *** Arg Gln *** Arg ***> 

e e e e e e_CMV PROMOTER e e e e e e > 

>BglI 

530 i 540 550 560 570 

* *l* * + * * * * 

ATG GCC CGC CTG GCA TTA TGC CCA GTA CAT GAC CTT ATG GGA CTT TCC 
Met Ala Arg Leu Ala Leu Cys Pro Val His Asp Leu Met Gly Leu Ser> 
e e e e e e_CMV PROMOTER e e e e e e > 

>BsaAI >NcoI 

I I 
>SnaBI >StyI >MslI 

I I I 

580 590 600 610 I 

* * * ** * # + * 

TAC TTG GCA GTA CAT CTA CGT ATT AGT CAT CGC TAT TAC CAT GGT GAT 
Tyr Leu Ala Val His Leu Arg Ile Ser His Arg Tyr Tyr His Gly Asp> 
e e e e e e_CMV PROMOTER e e e e e e > 

620 630 640 650 660 

+ * * * * ***** 

GCG GTT TTG GCA GTA CAT CAA TGG GCG TGG ATA GCG GTT TGA CTC ACG 
Ala Val Leu Ala Val His Gln Trp Ala Trp Ile Ala Val *** Leu Thr> 
e e e e e e_CMV PROMOTER e e e e e e > 

>AatII >BanI 
I I 
670 680 690 i 700 710 I 

* * * ** #[* * * * 

GGG ATT TCC AAG TCT CCA CCC CAT TGA CGT CAA TGG GAG TTT GTT TTG 
Gly Ile Ser Lys Ser Pro Pro His *** Arg Gln Trp Glu Phe Val Leu> 
e e e e e e_CMV PROMOTER e e e e e e > 

720 730 740 750 760 

* + * * ** * + * 

GCA CCA AAA TCA ACG GGA CTT TCC AAA ATG TCG TAA CAA CTC CGC CCC 
Ala Pro Lys Ser Thr Gly Leu Ser Lys Met Ser *** Gln Leu Arg Pro> 
e e e e e e_CMV PROMOTER e e e e e e > 

770 780 790 800 810 

★ ** * ****** 

ATT GAC GCA AAT GGG CGG TAG GCG TGT ACG GTG GGA GGT CTA TAT AAG 
Ile Asp Ala Asn Gly Arg *** Ala Cys Thr Val Gly Gly Leu Tyr Lys> 
e e e e e e_CMV PROMOTER e e e e e e > 
>SacI 
>BanII 
>BsiHKAI 
>Ecl136II 

820 830 840 850 

CAG AGC TCT CTG GCT AAC TAG AGA ACC CAC TGC TTA CTG GCT TAT CGA 
Gln Ser Ser Leu Ala Asn *** Arg Thr His Cys Leu Leu Ala Tyr Arg> 
e e e e e e_CMV PROMOTER e e e e e e > 



>AseI 
>BsaI 
>SfcI 
>T7 PROMOTER 
>HindIII 
>KpnI 
>Acc65I 
>BanI 

860 870 880 890 900 910 



AAT T AATACGA CTCACTATAG GGAGACCCAA GCTTCGCGCG GGTACCACTC 
Asn Xxx> 
e > 



>EarI 
>PflMI 
>PvuII 
>BanII 

920 930 940 950 960 970 

TCTTCCGCAT CGCTGTCTGC GAGGGCCAGC TGTTGGGCTC GCGGTTGAGG ACAAACTCTT 
f f TRIPARTITE LEADER f f > 



>EarI >ScaI

980 990 1000 1010 1020 1030

CGCGGTCTTT CCAGTACTCT TGGATCGGAA ACCCGTCGGC CTCCGAACGG TACTCCGCCA
f f TRIPARTITE LEADER f f >

>EcoO109I >PpuMI >BsiEI >BsaWI >BsoBI >AvaI >XhoI >PaeR7I

1040 1050 1060 1070 1080 1090

* i * * + * II* * * # * „ - 

CCGAGGGACC TGAGCGAGTC CGCATCGACC GGATCGGAAA ACCTCTCGAG GAACTGAAAA



TRIPARTITE LEADER 
>HincII >EcoO109I >BsaWI

>HpaI >PpuMI >BamHI

1100 ! 1110 1120 1130 1140 I 1150

* .J* * * ♦ * * * i * r I * 

ACCAGAAAGT TAACTGGTAA GTTTAGTCTT TTTGTCTTTT TATTTCAGGT CCCGGATCCG 

HYBRID SV40 LATE INTRON b_ > 



b 



>BseRI >StuI 
1160 ! 1170 1180 1190 1200 I 1210

* * * I * # * * * * + * w 

GTGGTGGTGC AAATCAAAGA ACTGCTCCTC AGTGGATGTT GCCTTTACTT CTAGGCCTGT 
b HYBRID SV40 LATE INTRON b >

>BsiEI >EagI >EaeI >XbaI >SacII >PstI >NotI >SfcI

1220 1230 1240 1250 1260 1270
* * * * * * * * * 1*1 * I * 
ACGGAAGTGT TACTTCTGCT CTAAAAGCTG CGGAATTGTA CCCGCGGCCG CTGCAGTCTA 
HYBRID SV40 LATE INTRON _b > 

>ApaI >BspDI >EcoO109I >ApoI >EcoRV >Bsp120I
>EcoRI >BsiWI >ClaI >BanII >SfcI >MslI

1280 1290 1300 1310 1320

t * * i * I * * I I* * * 1 * 

GACGAATTCG CGTACGATAT CGATGGGCCC TATT CTA TAG TGT CAC CTA AAT 

Leu *** Cys His Leu Asn> 
c_SP6 PROMOTER c > 

>SacI >BanII >BsiHKAI >Ecl136II >BclI
>BGH_POLY_A

1330 1340 1350 1360 1370 1380

I* I* | * * * * 

GCTAGAGC TCGCTGATCA GCCTCGACTG TGCCTTCTAG TTGCCAGCCA TCTGTTGTTT 

>BanI 

1390 1400 1410 | 1420 1430 1440. 

GCCCCTCCCC CGTGCCTTCC TTGACCCTGG AAGGTGCCAC TCCCACTGTC CTTTCCTAAT 
1450 1460 1470 1480 1490 1500

* * + * * * * * * * * 

AAAATGAGGA AATTGCATCG CATTGTCTGA GTAGGTGTCA TTCTATTCTG GGGGGTGGGG 

>BbsI 
I 

1510 1520 1530 1540 1550 1560 

* * * * * * +|* * * * * 

TGGGGCAGGA CAGCAAGGGG GAGGATTGGG AAGACAATAG CCGAAATGAC CGACCAAGCG 

>BspMI 
I 

>BssSI 

1570 1580 1590 1600 1610 1620 

+ * * I * * * * * * * 

ACGCCCAACC TGCCATCACG AGATTTCGAT TCCACCGCCG CCTTCTATGA AAGGTTGGGC 

>NaeI 
I 

>NgoMI 

I I 
>BpmI 

I I 
>BsrFI 

I I 

1630 1640 I .1 1650 1660 1670 1680 

+ + ★ * * * * * 

TTCGGAATCG TTTTCCGGGA CGCCGGCTGG ATGATCCTCC AGCGCCGGGA TCTCATGCTG 

>BpmI 
I 

>SV40_early_poly_A

1690 1700 1710 1720 1730 1740 

* * *j* * * * * * * * * 

GAGTTCTTCG CCCACCCCAA CTTGTTTATT GCAGCTTATA ATGGTTACAA ATAAAGCAAT 

>ApoI >BsmI 

I t 
1750 1760 1770 1780 1790 1800 

* | * + * * * * | * ★ " * * 

AGCATCACAA ATTTCACAAA TAAAGCATTT TTTTCACTGC ATTCTAGTTG TGGTTTGTCC 

>HincII >Bst1107I >AccI >AccI >SalI

1810 1820 1830 1840 1850 1860

* I j j * * * * + 

AAACTCATCA ATGTATCTTA TCATGTCTGT ATACCGTCGA CCTCTAGCTA GAGCTTGGCG 



> 



>BsrBI 
I 

1870 1880 1890 1900 I 1910 1920 

* * * * * * *•*[** * * 

TAATCATGGT CATAGCTGTT TCCTGTGTGA AATTGTTATC CGCTCACAAT TCCACACAAC 
d d PUC19 BACKBONE d d > 
>BanI
I 

1930 1940 1950 I 1960 1970 1980 

ATACGAGCCG GAAGCATAAA GTGTAAAGCC TGGGGTGCCT AATGAGTGAG CTAACTCACA 
d _d PUC19 BACKBONE d _d > 

>AseI 

i 

>AseI >PvuII t 

t I I 

I 1990 2000 2010 2020 2030 t 2040 

I** * + * * * * . * * |+ * 

TTAATTGCGT TGCGCTCACT GCCCGCTTTC CAGTCGGGAA ACCTGTCGTG CCAGCTGCAT 
d d PUC19 BACKBONE d d > 

>EarI 
I 

>EaeI >HaeII >SapI 

I I i 

2050 2060 2070 2080 2090 i 2100 

I » - * * * *• * | * ]* 

TAATGAATCG GCCAACGCGC GGGGAGAGGC GGTTTGCGTA TTGGGCGCTC TTCCGCTTCC 
d d PUC19 BACKBONE d d > 

>BsiEI >BsrBI 

I f 

2110 2120 (2130 2140 I 2150 2160 

* * * * * | * * *|* * * * 

TCGCTCACTG ACTCGCTGCG CTCGGTCGTT CGGCTGCGGC GAGCGGTATC AGCTCACTCA 
d d PUC19 BACKBONE d d > 

>AflIII 
I 

2170 2180 2190 2200 2210 2220 

* * * * + * * * * * # * 

AAGGCGGTAA TACGGTTATC CACAGAATCA GGGGATAACG CAGGAAAGAA CATGTGAGCA 
d d PUC19 BACKBONE d d > 

2230 2240 2250 2260 2270 2280 

* + + + * + * * + * * * 

AAAGGCCAGC AAAAGGCCAG GAACCGTAAA AAGGCCGCGT TGCTGGCGTT TTTCCATAGG 
d d PUC19 BACKBONE_ d d > 

>DrdI 

I" 

2290 2300 2310 2320 2330 2340 

* ★ ** *# *|* * * ** 

CTCCGCCCCC CTGACGAGCA TCACAAAAAT CGACGCTCAA GTCAGAGGTG GCGAAACCCG 
d d PUC19 BACKBONE d d > 

>BssSI 
I 

2350 2360 2370 2380 I 2390 2400 

** ** #* + # 

ACAGGACTAT AAAGATACCA GGCGTTTCCC CCTGGAAGCT CCCTCGTGCG CTCTCCTGTT 
d d PUC19 BACKBONE d d > 
>BsaWI >HaeII
I t
2410 I 2420 2430 2440 2450 2460

* * * * * * * * * * * j * 

CCGACCCTGC CGCTTACCGG ATACCTGTCC GCCTTTCTCC CTTCGGGAAG CGTGGCGCTT 
d d PUC19 BACKBONE d __d > 

>SfcI 

! 

2470 I 2480 2490 2500 2510 2520 

* * » * * * * * * * * * 

TCTCAATGCT CACGCTGTAG GTATCTCAGT TCGGTGTAGG TCGTTCGCTC CAAGCTGGGC 
d d PUC19 BACKBONE d d > 

>BsiHKAI 
! 

>ApaLI >BsiEI >BsaWI

i I t I 

| 2530 2540 2550 2560 I 2570 2580 

1*1* * * * + * *i* * * * 

TGTGTGCACG AACCCCCCGT TCAGCCCGAC CGCTGCGCCT TATCCGGTAA CTATCGTCTT 
. d d PUC19 BACKBONE d d > 

>AlwNI 
I 

2590 2600 2610 2620 I 2630 2640

** * * ** ** *j* * * 

GAGTCCAACC CGGTAAGACA CGACTTATCG CCACTGGCAG CAGCCACTGG TAACAGGATT 
d d PUC19 BACKBONE d d > 

>SfcI 
I 

2650 2660 I 2670 2680 2690 2700

* * * ★ *[ + * * * * + ★ 

AGCAGAGCGA GGTATGTAGG CGGTGCTACA GAGTTCTTGA AGTGGTGGCC TAACTACGGC 
d d PUC19 BACKBONE d d > 

>Eco57I 
I 

2710 2720 2730 2740 2750 2760 

* + ** * * * * ** * j * 

TACACTAGAA GGACAGTATT TGGTATCTGC GCTCTGCTGA AGCCAGTTAC CTTCGGAAAA 
d d PUC19 BACKBONE d d > 

2770 2780 2790 2800 2810 2820 

** * * ** * * ★ * ** 

AGAGTTGGTA GCTCTTGATC CGGCAAACAA ACCACCGCTG GTAGCGGTGG TTTTTTTGTT 
d d PUC19 BACKBONE d d > 

2830 2840 2850 2860 2870 2880 

* * * * * * * * * * ** 

TGCAAGCAGC AGATTACGCG CAGAAAAAAA GGATCTCAAG AAGATCCTTT GATCTTTTCT 
d d PUC19 BACKBONE d d >

>BspHI 
f 

2890 2900 2910 2920 2930 2940 

* * * * * *■ * * ** ** 

ACGGGGTCTG ACGCTCAGTG GAACGAAAAC TCACGTTAAG GGATTTTGGT CATGAGATTA 
d d PUC19 BACKBONE d d > 
>DraI >DraI
I I 
2950 2960 2970 2980 2990 3000 

TCAAAAAGGA TCTTCACCTA GATCCTTTTA AATTAAAAAT GAAGTTTTAA ATCAATCTAA 
d d PUC19 BACKBONE _d d > 

>BanI
I 

3010 3020 3030 3040 3050 I 3060 

* * r * *■ * + * * * j * * 

AGTATATATG AGTAAACTTG GTCTGACAGT TACCAATGCT TAATCAGTGA GGCACCTATC

_a AMP-ORF a > 

d d PUC19 BACKBONE d d > 

>AhdI 

I 

3070 3080 3090 3100 I 3110 3120 

* * + * + * *+|+* + * 

TCAGCGATCT GTCTATTTCG TTCATCCATA GTTGCCTGAC TCCCCGTCGT GTAGATAACT 

a a AMP-ORF a a > 

d d PUC19 BACKBONE d d > 

>BsaI 
I 

>BsrDI >BpmI 
I I 

3130 3140 3150 3160 1 3170 I 3180 

* + * * * * + * 

ACGATACGGG AGGGCTTACC ATCTGGCCCC AGTGCTGCAA TGATACCGCG AGACCCACGC 

a a AMP-ORF a a > 

d d PUC19 BACKBONE d d > 



>BsrFI >BglI 
! I 

| 3190 3200 3210 3220 I 3230 3240 

|*4 * * ★ * * * | * * * * 

TCACCGGCTC CAGATTTATC AGCAATAAAC CAGCCAGCCG GAAGGGCCGA GCGCAGAAGT 

a a AMP-ORF a a >

d d PUC19 BACKBONE d d > 



>AseI 
i 

3250 3260 3270 I 3280 3290 3300 

** * * * + * • * * * * * 

GGTCCTGCAA CTTTATCCGC CTCCATCCAG TCTATTAATT GTTGCCGGGA AGCTAGAGTA 

a a AMP-ORF a a > 

d d PUC19 BACKBONE d d > 

>Psp1406I

! 

>FspI I >BsrDI >SfcI >MslI 

It I I i 

3310 3320 I 3330 3340 i 3350 t 3360 

★ + * * #|* *|* |* * * * 

AGTAGTTCGC CAGTTAATAG TTTGCGCAAC GTTGTTGCCA TTGCTACAGG CATCGTGGTG 

a a AMP-ORF a a > 

d d PUC19 BACKBONE d d _> 
>BsaWI
i 

3370 3380 3390 I 3400 3410 3420

I* * * * 

TCACGCTCGT CGTTTGGTAT GGCTTCATTC AGCTCCGGTT CCCAACGATC AAGGCGAGTT 

a a AMP-ORF a a >

d d PUC19 BACKBONE d d >

>BsiEI 
i 

>PvuI 

I 

3430 3440 3450 3460 3470 I 3480 

* * * * * * * * 

ACATGATCCC CCATGTTGTG CAAAAAAGCG GTTAGCTCCT TCGGTCCTCC GATCGTTGTC 

a a AMP-ORF a a > 

d d PUC19 BACKBONE d d > 

>EaeI >MslI 
I I 
3490 I 3500 3510 1 3520 3530 3540 

* * I * * * * I** * » 

AGAAGTAAGT TGGCCGCAGT GTTATCACTC ATGGTTATGG CAGCACTGCA TAATTCTCTT 
a a AMP-ORF a a > 



"d d PUC19 BACKBONE 



>ScaI 

i 

3550 3560 3570 3580 I 3590 3600

ACTGTCATGC CATCCGTAAG ATGCTTTTCT GTGACTGGTG AGTACTCAAC CAAGTCATTC 

a a AMP-ORF a a > 

d d PUC19 BACKBONE d d > 



>BsiEI 
I 

3610 3620 I 3630 3640 3650 3660 

* # * * j * * * * * * * * 

TGAGAATAGT GTATGCGGCG ACCGAGTTGC TCTTGCCCGG CGTCAATACG GGATAATACC 

a a AMP-ORF a a > 

d d PUC19 BACKBONE d d > 



>Psp1406I
* i 

>DraI >BsiHKAI >XmnI 
I I i 

3670 3680 3690 3700 I 3710 3720 

* * * * ★ I * * * 

GCGCCACATA GCAGAACTTT AAAAGTGCTC ATCATTGGAA AACGTTCTTC GGGGCGAAAA 

a AMP-ORF a a > 



_a 

d d PUC19 BACKBONE 



>ApaLI 

I 

>Eco57I 
t 

>BssSI t >BsiHKAI 
I I i 

3730 3740 3750 3760 3770 1 3780 

* * * * + * * + * | * |** 

CTCTCAAGGA TCTTACCGCT GTTGAGATCC AGTTCGATGT AACCCACTCG TGCACCCAAC 

a a _AMP-ORF a a > 

d d PUC19 BACKBONE d d > 
3790 3800 3810 3820 3830 3840

*■ * * * * * + * * + **■ 

TGATCTTCAG CATCTTTTAC TTTCACCAGC GTTTCTGGGT GAGCAAAAAC AGGAAGGCAA 
a a AMP-ORF a a > 



PUC19 BACKBONE d d_ 



>MslI >EarI 

I I 

3850 3860 3870 I 3880 3890 3900 

+ * * * * * | * * * + *j* 

AATGCCGCAA AAAAGGGAAT AAGGGCGACA CGGAAATGTT GAATACTCAT ACTCTTCCTT 

a a AMP-ORF a > 

d d PUC19 BACKBONE d d >

>SspI >BspHI >BsrBI

T 1 I 

3910 3920 3930 3940 I 3950 3960 

* 1+ * * * * *|* |* + * * 

TTTCAATATT ATTGAAGCAT TTATCAGGGT TATTGTCTCA TGAGCGGATA CATATTTGAA 
d d PUC19 BACKBONE d d > 

3970 3980 3990 4000 4010 4020 

** * * ** * * + + ** 

TGTATTTAGA AAAATAAACA AATAGGGGTT CCGCGCACAT TTCCCCGAAA AGTGCCACCT 
d d PUC19 BACKBONE d d > 

>HincII 
I 

>AatII 
I I 

>AccI 
I t 

>SalI 
I I [ 
I* I 
GACGTC 

> 
>HincII
I 

> AccI 
I I 

>BglII >SalI
i Ml 
10 I 20 30 I [ | 40 50 60 

, j * - I I I* 

GACGGATCGG GAGATCTCCC GATCCCCTAT GGTCGACTCT CAGTACAATC TGCTCTGATG 

>AlwNI 
I 

70 80 90 100 110 120

* + * + * * * * * ★ * * 

CCGCATAGTT AAGCCAGTAT CTGCTCCCTG CTTGTGTGTT GGAGGTCGCT GAGTAGTGCG 

>ApoI >MfeI

I I 

I 130 140 150 160 I 170 180

*l* * * * + + + * 

CGAGCAAAAT TTAAGCTACA ACAAGGCAAG GCTTGACCGA CAATTGCATG AAGAATCTGC 

>HincII 

>AflIII 
I 

>NruI >MluI 
I I 
190 200 I 210 220 I 230

* * * * *|* * * * | * 

TTAGGGTTAG GCGTTTTGCG CTGCTTCG CGA TGT ACG GGC CAG ATA TAC GCG TTG 

Arg Cys Thr Gly Gln Ile Tyr Ala Leu>
f f CMV PROMOTER f f > 

>SpeI >AseI 
I I 

240 250 I 260 270 280 

* *l* * | ★ + * * * 

ACA TTG ATT ATT GAC TAG TTA TTA ATA GTA ATC AAT TAC GGG GTC ATT 
Thr Leu Ile Ile Asp *** Leu Leu Ile Val Ile Asn Tyr Gly Val Ile>
f f f f f f_CMV PROMOTER f f f f f f >

290 300 310 320 330 

* * # * * * * ★ * * 

AGT TCA TAG CCC ATA TAT GGA GTT CCG CGT TAC ATA ACT TAC GGT AAA 
Ser Ser *** Pro Ile Tyr Gly Val Pro Arg Tyr Ile Thr Tyr Gly Lys>
f f f f f f_CMV PROMOTER f_ f f f f f > 

>BglI >AatII 

I I 
340 350 360 370 1 

+ * * + + * # * * 

TGG CCC GCC TGG CTG ACC GCC CAA CGA CCC CCG CCC ATT GAC GTC AAT 
Trp Pro Ala Trp Leu Thr Ala Gln Arg Pro Pro Pro Ile Asp Val Asn>
f f f f f_ f_CMV PROMOTER f f f f f f > 

380 390 400 410 420 

** * ** * * + * * 

AAT GAC GTA TGT TCC CAT AGT AAC GCC AAT AGG GAC TTT CCA TTG ACG 
Asn Asp Val Cys Ser His Ser Asn Ala Asn Arg Asp Phe Pro Leu Thr>
f f f f f f_CMV PROMOTER f f f f f f >
>AatII >BglI
I I 
430 440 450 460 i 470 

I* * * * * * * | * * * 

TCA ATG GGT GGA CTA TTT ACG GTA AAC TGC CCA CTT GGC AGT ACA TCA 
Ser Met Gly Gly Leu Phe Thr Val Asn Cys Pro Leu Gly Ser Thr Ser>
f f f f f_CMV PROMOTER f f f f f f > 

>NdeI >AatII
i i
480 i 490 500 510 I 520
»j* + + + * * | * * 
AGT GTA TCA TAT GCC AAG TAC GCC CCC TAT TGA CGT CAA TGA CGG TAA 
Ser Val Ser Tyr Ala Lys Tyr Ala Pro Tyr *** Arg Gln *** Arg ***>
f f f f f f_CMV PROMOTER f f f f f f >

>BglI 
I 

530 I 540 550 560 570 

* + j# * * * * * * * 

ATG GCC CGC CTG GCA TTA TGC CCA GTA CAT GAC CTT ATG GGA CTT TCC 
Met Ala Arg Leu Ala Leu Cys Pro Val His Asp Leu Met Gly Leu Ser> 
f f f f f f_CMV PROMOTER f f f f f f >

>BsaAI >StyI 

I t 
>SnaBI >NcoI >MslI 

t 1 I 

580 590 600 610 t 

* + * * * * * ★ * 

TAC TTG GCA GTA CAT CTA CGT ATT AGT CAT CGC TAT TAC CAT GGT GAT 
Tyr Leu Ala Val His Leu Arg Ile Ser His Arg Tyr Tyr His Gly Asp>
f f f f f f_CMV PROMOTER f f f f f f >

620 630 640 650 660 

* * * * + * * * * * 

GCG GTT TTG GCA GTA CAT CAA TGG GCG TGG ATA GCG GTT TGA CTC ACG 
Ala Val Leu Ala Val His Gln Trp Ala Trp Ile Ala Val *** Leu Thr>
f f f f f f_CMV PROMOTER f f f f f f > 

>AatII >BanI 
I I 
670 680 690 ! 700 710 I 

* * * * * * | * * * * 

GGG ATT TCC AAG TCT CCA CCC CAT TGA CGT CAA TGG GAG TTT GTT TTG 
Gly Ile Ser Lys Ser Pro Pro His *** Arg Gln Trp Glu Phe Val Leu>
f f f f f f_CMV PROMOTER _f f f f f f > 

720 730 740 750 760 

+ ** * * * * ** 

GCA CCA AAA TCA ACG GGA CTT TCC AAA ATG TCG TAA CAA CTC CGC CCC 
Ala Pro Lys Ser Thr Gly Leu Ser Lys Met Ser *** Gln Leu Arg Pro>
f f f f f f_CMV PROMOTER f f f f f f >

770 780 790 800 810 
********** 
ATT GAC GCA AAT GGG CGG TAG GCG TGT ACG GTG GGA GGT CTA TAT AAG 
Ile Asp Ala Asn Gly Arg *** Ala Cys Thr Val Gly Gly Leu Tyr Lys>
f f f f f f_CMV PROMOTER f f f f f f >
>BsiHKAI >SacI >BanII >Ecl136II

820 830 840 850

+ I I # * * * * *■ * * 

CAG AGC TCT CTG GCT AAC TAG AGA ACC CAC TGC TTA CTG GCT TAT CGA 
Gln Ser Ser Leu Ala Asn *** Arg Thr His Cys Leu Leu Ala Tyr Arg>



f CMV PROMOTER 



>AseI >BsaI >SfcI >HindIII >KpnI >BanI >Acc65I
>T7 PROMOTER

860 870 880 890 900 910



AAT T AATACGA CTCACTATAG GGAGACCCAA GCTTCGCGCG GGTACCACTC 
Asn Xxx> 
f > 



>PflMI

>EarI >PvuII| >BanII 

I III 

| 920 930 940 I 950 960 970 

#1+ * + *|*|* |* + + * * 

TCTTCCGCAT CGCTGTCTGC GAGGGCCAGC TGTTGGGCTC GCGGTTGAGG ACAAACTCTT 
g TRIPARTITE LEADER SEQUENCE g > 



>EarI >ScaI

980 990 1000 1010 1020 1030



CGCGGTCTTT CCAGTACTCT TGGATCGGAA ACCCGTCGGC CTCCGAACGG TACTCCGCCA 
g TRIPARTITE LEADER SEQUENCE g > 



>EcoO109I >PpuMI >BsiEI >BsaWI >XhoI >AvaI >BsoBI >PaeR7I

1040 1050 1060 1070 1080 1090

* I * * * *||* * * * + * * 

CCGAGGGACC TGAGCGAGTC CGCATCGACC GGATCGGAAA ACCTCTCGAG GAACTGAAAA 
TRIPARTITE LEADER SEQUENCE g >



>HpaI >PpuMI 

I i 
>HincII >EcoO109I

I I 
1100 I 1110 1120 1130 1140 1150 

* | * * •* * * *i* 

ACCAGAAAGT TAACTGGTAA GTTTAGTCTT TTTGTCTTTT TATTTCAGGT CCCGGATCTG 
b HYBRID SV40 LATE INTRON b b >

>Ppu10I
I 

>21_bp_tandem_repea- 'IJ 110] ,[ 1.02] ,[ 112] I 

1160 1170 1180 1190 1200 1210

★ * * * # + * * * | * 

AGTTAGGGCG GGACATGGGC GGAGTTAGGG GCG' ,T GGTTGCTGAC TAATTGAGAT 
< h h EARLY MRNA h

>SphI 
"l 

>NsiI
I 

I <72 bp tandem repeat_enhancer_sequence_ 

I ~ " " t 

| 1220 1230 1240 1250 1260 1 1270 

* * * * * + * * * . * | * * 

GCATGCTTTG CATACTTCTG CCTGCTGGGG AGCCTGGGGA CTTTCCACAC CTGGTTGCTG 
< h h EARLY MRNA h h 

>NsiI 
! 

>Ppu10I |>SphI

1280 I ! 1290 1300 1310 1320 1330 

★ * ★ *-» * * * * 

ACTAATTGAG ATGCATGCTT TGCATACTTC TGCCTGCTGG GGAGCCTGGG GACTTTCCAC
< h h EARLY MRNA h h 

>PvuII >BsaWI >BseRI 

I I t 

<72_bp_tandem_repeat_enhancer_sequence_B_ 

<T_antigen_binding_site_II_ I 

I III I 

| 1340 1350 I 1360 I 1370 1380 1390

I* * * * * ★ *j* * * *.|* 

ACCCTAACTG ACACACATTC CACAGCTGGT TCTTTCAGAT CCGGTGGTGG TGCAAATCAA 

HYBRID SV40 _> 

< h EARLY MRNA h 

MINOR LATE 19S > 

>StuI 
I 

1400 1410 1420 1430 1440 1450 

# *■ » * * * *|* * * * * 

AGAACTGCTC CTCAGTGGAT GTTGCCTTTA CTTCTAGGCC TGTACGGAAG TGTTACTTCT 
c HYBRID SV40 LATE INTRON c >
>BsiEI >NotI >EaeI >SacII >EagI >SfcI >XbaI >PstI >EcoRI >ApoI >BsiWI

1460 1470 1480 1490 1500 1510



GCTCTAAAAG CTGCGGAATT GTACCCGCGG CCGCTGCAGT CTAGACGAAT TCGCGTACGA 
HYBRID SV40 LATE INT > 



>BspDI >ClaI >EcoRV >ApaI >BanII >EcoO109I >Bsp120I >SfcI >MslI
>SacI >BsiHKAI >BanII >Ecl136II >BclI
>BGH_POLY_A

1520 1530 1540 1550 1560

TATCGATGGG CCCTATT CTA TAG TGT CAC CTA AAT GCTAG AGCTCGCTGA 
Leu *** Cys His Leu Asn> 
d_SP6 PROMOTER d > 

1570 1580 1590 1600 1610 1620 

* * * * * « * * * + * * 

TCAGCCTCGA CTGTGCCTTC TAGTTGCCAG CCATCTGTTG TTTGCCCCTC CCCCGTGCCT 

>BanI 
I 

1630 I 1640 1650 1660 1670 1680

- *■ *j* * + * + + * * * 

TCCTTGACCC TGGAAGGTGC CACTCCCACT GTCCTTTCCT AATAAAATGA GGAAATTGCA 
1690 1700 1710 1720 1730 1740 

* * ★ # * * * * * * * * 

TCGCATTGTC TGAGTAGGTG TCATTCTATT CTGGGGGGTG GGGTGGGGCA GGACAGCAAG 



>BbsI >BspMI >BssSI

1750 1760 1770 1780 1790 1800

+ * *# * * **■ * * 

GGGGAGGATT GGGAAGACAA TAGCCGAAAT GACCGACCAA GCGACGCCCA ACCTGCCATC 
1810 1820 1830 1840 1850 1860 

»* ** ** ** **■ 

ACGAGATTTC GATTCCACCG CCGCCTTCTA TGAAAGGTTG GGCTTCGGAA TCGTTTTCCG 
>NaeI

I 

>BpmI 

I ! 
>BsrFI 

I I 
>NgoMI

! 1870 1880 1890 1900 1910 1920 

GGACGCCGGC TGGATGATCC TCCAGCGCGG GGATCTCATG CTGGAGTTCT TCGCCCACCC 

>BpmI >ApoI
I 

>SV40_early_poly_A 

| 1930 1940 1950 1960 1970 I 1980

l+* * * + * ** * * 

CAACTTGTTT ATTGCAGCTT ATAATGGTTA CAAATAAAGC AATAGCATCA CAAATTTCAC 

>BsmI 

1990 2000 I 2010 2020 2030 2040 

* * + +|** # * + # * + 

AAATAAAGCA TTTTTTTCAC TGCATTCTAG TTGTGGTTTG TCCAAACTCA TCAATGTATC 

>HincII 
I 

>Bstll07I >AccI 
I 1 t 

>AccI >SalI 
II Ml 

2050 I I 2060 [ 2070 2080 2090 2100 

+ * | | * | * | * * * * * * * * 

TTATCATGTC TGTATACCGT CGACCTCTAG CTAGAGCTTG GCGTAATCAT GGTCATAGCT 

PUC19 BACKBONE > 

>BsrBI 
I 

2110 2120 I 2130 2140 2150 2160 

* * * * * * * * * * *w 

GTTTCCTGTG TGAAATTGTT ATCCGCTCAC AATTCCACAC AACATACGAG CCGGAAGCAT 
e e PUC19 BACKBONE e e > 

>BanI >AseI 
I I 
2170 2180 2190 2200 I 2210 2220 

* * *|* * * * * j** * ★ 

AAAGTGTAAA GCCTGGGGTG CCTAATGAGT GAGCTAACTC ACATTAATTG CGTTGCGCTC 
e__ e PUC19 BACKBONE e e > 

>PvuII >AseI >EaeI

I I i 

2230 2240 2250 2260 I 2270 I 2280 

* * * * * * * j * * 

ACTGCCCGCT TTCCAGTCGG GAAACCTGTC GTGCCAGCTG CATTAATGAA TCGGCCAACG 
e e PUC19 BACKBONE e e > 
>SapI
I 

>HaeII >EarI 
I ! 

2290 2300 2310 I 2320 2330 2340 

* * * * * * ★ + * 

CGCGGGGAGA GGCGGTTTGC GTATTGGGCG CTCTTCCGCT TCCTCGCTCA CTGACTCGCT 
e e PUC19 BACKBONE e e > 

>BsiEI >BsrBI 

i I 
2350 2360 I 2370 2380 2390 2400

+ j+ * * *| + *• ★ + * * * 

GCGCTCGGTC GTTCGGCTGC GGCGAGCGGT ATCAGCTCAC TCAAAGGCGG TAATACGGTT 
e e PUC19 BACKBONE e _e > 

>AflIII 
I 

2410 2420 2430 I 2440 2450 2460 

** * * **|** * * + * 

ATCCACAGAA TCAGGGGATA ACGCAGGAAA GAACATGTGA GCAAAAGGCC AGCAAAAGGC 
e e PUC19 BACKBONE e e > 

2470 2480 2490 2500 2510 2520 

* ★ » * * + * * + + # + 

CAGGAACCGT AAAAAGGCCG CGTTGCTGGC GTTTTTCCAT AGGCTCCGCC CCCCTGACGA 
e e PUC19 BACKBONE e_ e > 

>DrdI
I 

2530 2540 I 2550 2560 2570 2580 

+ + * ★ | * * + + * * * * 

GCATCACAAA AATCGACGCT CAAGTCAGAG GTGGCGAAAC CCGACAGGAC TATAAAGATA 
e e PUC19 BACKBONE e e > 

>BssSI >BsaWI 
I t 
2590 2600 I 2610 2620 2630 2640

* * ★ * * | * # * * * * I * 
CCAGGCGTTT CCCCCTGGAA GCTCCCTCGT GCGCTCTCCT GTTCCGACCC TGCCGCTTAC 
e e PUC19 BACKBONE e e > 

>HaeII >SfcI 
I I 

2650 2660 2670 2680 I 2690 2700 

* + * * + * *- *|* * * | * 

CGGATACCTG TCCGCCTTTC TCCCTTCGGG AAGCGTGGCG CTTTCTCAAT GCTCACGCTG 
e e PUC19 BACKBONE e e > 

>BsiHKAI 

>ApaLI 
I 

2710 2720 2730 2740 2750 I 2760 

* * ++ ** + # *|* 

TAGGTATCTC AGTTCGGTGT AGGTCGTTCG CTCCAAGCTG GGCTGTGTGC ACGAACCCCC 
e e PUC19 BACKBONE e e__ > 

FIG. 8 

(CONTINUED) 






WO 98/21228 



PCT/US97/21821 






>BsiEI >BsaWI

I I 

2770 I 2780 1 2790 2800 2810 2820 

* *l* * +|* * + * » + * 

CGTTCAGCCC GACCGCTGCG CCTTATCCGG TAACTATCGT CTTGAGTCCA ACCCGGTAAG
e e PUC19 BACKBONE e e > 

>AlwNI 

t 

2830 2840 2850 2860 2870 2880 

ACACGACTTA TCGCCACTGG CAGCAGCCAC TGGTAACAGG ATTAGCAGAG CGAGGTATGT 
e e PUC19 BACKBONE e e > 

>SfcI 

2890 2900 2910 2920 2930 2940 

* I * * * * * * * * * * * 

AGGCGGTGCT ACAGAGTTCT TGAAGTGGTG GCCTAACTAC GGCTACACTA GAAGGACAGT 
e e PUC19 BACKBONE e e > 

>Eco57I
! 

2950 2960 2970 2980 I 2990 3000 

* * # * + * + *|** * # 

ATTTGGTATC TGCGCTCTGC TGAAGCCAGT TACCTTCGGA AAAAGAGTTG GTAGCTCTTG
e e PUC19 BACKBONE e e > 

3010 3020 3030 3040 3050 3060 

* * + # # * + * + * * * 

ATCCGGCAAA CAAACCACCG CTGGTAGCGG TGGTTTTTTT GTTTGCAAGC AGCAGATTAC 
e e PUC19 BACKBONE e e > 

3070 3080 3090 3100 3110 3120 

+ * * * * * * # * + 

GCGCAGAAAA AAAGGATCTC AAGAAGATCC TTTGATCTTT TCTACGGGGT CTGACGCTCA 
e e PUC19 BACKBONE e e > 

>BspHI 
I 

3130 3140 3150 t 3160 3170 3180 

GTGGAACGAA AACTCACGTT AAGGGATTTT GGTCATGAGA TTATCAAAAA GGATCTTCAC 
e e PUC19 BACKBONE e e > 

>DraI >DraI 

I t 

3190 | 3200 3210 I 3220 3230 3240 

* * I * + # # |- * * ** * + 

CTAGATCCTT TTAAATTAAA AATGAAGTTT TAAATCAATC TAAAGTATAT ATGAGTAAAC 
e e PUC19 BACKBONE e e > 

>BanI 
I 

3250 3260 3270 i 3280 3290 3300 

** ** ** |** * * ** 

TTGGTCTGAC AGTTACCAAT GCTTAATCAG TGAGGCACCT ATCTCAGCGA TCTGTCTATT 

a a AMP-ORF_a a > 

e e PUC19 BACKBONE e e > 
>AhdI
I 

3310 3320 I 3330 3340 3350 3360

TCGTTCATCC ATAGTTGCCT GACTCCCCGT CGTGTAGATA ACTACGATAC GGGAGGGCTT 

a a AMP-ORF a a > 

e e PUC19 BACKBONE e e > 

>BsaI 
I 

>BsrDI >BpmI >BsrFI 

I I I 

3370 3380 3390 I 3400 I 3410 3420

* * *■ * #1* *j* * | * * » 

ACCATCTGGC CCCAGTGCTG CAATGATACC GCGAGACCCA CGCTCACCGG CTCCAGATTT 

a a AMP-ORF a a > 

"~ e e PUC19 BACKBONE e e > 

>BglI 
I 

3430 3440 I 3450 3460 3470 3480

★ * ** *|# * * + * * * 

ATCAGCAATA AACCAGCCAG CCGGAAGGGC CGAGCGCAGA AGTGGTCCTG CAACTTTATC 

a a AMP-ORF a a >

e e PUC19 BACKBONE e e > 



>AseI 
I 

3490 3500 3510 3520 3530 3540 

* * *|* + + ■ * + + * * * 